Title 



DNA Molecules and Polypeptides of Pseudomonas 
syringae Hrp Pathogenicity Island and Their Uses 



Inventors: Alan Collmer, James R. Alfano, and Amy O. Charkowski



Docket No.: 19603/3243 (CRF D-2601C) 




DNA Molecules and Polypeptides of Pseudomonas syringae 

Hrp Pathogenicity Island and Their Uses 

This application claims benefit of U.S. Provisional Patent Application
Serial Nos. 60/194,160, filed April 3, 2000, 60/224,604, filed August 11, 2000, and
60/249,548, filed November 17, 2000, which are hereby incorporated by reference in 
their entirety. 

This work was supported by National Science Foundation Grant 
No. MCB-9631530 and National Research Initiative Competitive Grants Program,
U.S. Department of Agriculture, Grant No. 98-35303-4488. The U.S. Government 
may have certain rights in this invention. 

Field of the Invention 

The present invention relates to isolated DNA molecules
corresponding to the open reading frames in the conserved effector loci and
exchangeable effector loci of Pseudomonas syringae, the isolated proteins
encoded thereby, and their various uses.

Background of the Invention

The plant pathogenic bacterium Pseudomonas syringae is noted for its 
diverse and host-specific interactions with plants (Hirano and Upper, 1990). A 
specific strain may be assigned to one of at least 40 pathovars based on its host range 
among different plant species and then further assigned to a race based on differential
interactions among cultivars of the host. In host plants the bacteria typically grow to
high population levels in leaf intercellular spaces and then produce necrotic lesions.
In nonhost plants or in host plants with race-specific resistance, the bacteria elicit the
hypersensitive response (HR), a rapid, defense-associated programmed death of plant
cells in contact with the pathogen (Alfano and Collmer, 1997). The ability to produce
either of these reactions in plants appears to be directed by hrp (HR and
pathogenicity) and hrc (HR and conserved) genes that encode a type III protein
secretion pathway and by avr (avirulence) and hop (Hrp-dependent outer protein)
genes that encode effector proteins injected into plant cells by the pathway (Alfano
and Collmer, 1997). These effectors may also betray the parasite to the HR-triggering
R-gene surveillance system of potential hosts (hence the avr designation), and plant
breeding for resistance based on such gene-for-gene (avr-R) interactions may produce
complex combinations of races and differential cultivars (Keen, 1990). hrp/hrc genes
are probably universal among necrosis-causing gram-negative plant pathogens, and
they have been sequenced in P. syringae pv. syringae (Psy) 61, Erwinia amylovora
Ea321, Xanthomonas campestris pv. vesicatoria (Xcv) 85-10, and Ralstonia
solanacearum GMI1000 (Alfano and Collmer, 1997). Based on their distinct gene
arrangements and regulatory components, the hrp/hrc gene clusters of these four
bacteria can be divided into two groups: I (Pseudomonas and Erwinia) and II
(Xanthomonas and Ralstonia). The discrepancy between the distribution of these
groups and the phylogeny of the bacteria provides some evidence that hrp/hrc gene
clusters have been horizontally acquired and, therefore, may represent pathogenicity
islands (Pais) (Alfano and Collmer, 1997).

Pais have been defined as gene clusters that (i) include many virulence 
genes, (ii) are selectively present in pathogenic strains, (iii) have different G+C 
content compared to host bacteria DNA, (iv) occupy large chromosomal regions, 
(v) are often flanked by direct repeats, (vi) are bordered by tRNA genes and/or cryptic 
mobile genetic elements, and (vii) are unstable (Hacker et al., 1997). Some Pais have 
inserted into different genomic locations in the same species (Wieler et al., 1997). 
Others reveal a mosaic structure indicative of multiple horizontal acquisitions (Hensel 
et al., 1999). Genes encoding type III secretion systems are present in Pais in animal 
pathogenic Salmonella spp. and Pseudomonas aeruginosa and on large plasmids in 
Yersinia and Shigella spp. Genes encoding effectors secreted by the pathway in these 
organisms are commonly linked to the pathway genes (Hueck, 1998), although a 
noteworthy exception is sopE, which is carried by a temperate phage without apparent 
linkage to SPI1 in certain isolates of S. typhimurium (Mirold et al., 1999). Three 
avr/hop genes have already been shown to be linked to the hrp/hrc cluster in P.
syringae: avrE and several other Hrp-regulated transcriptional units are linked to the
hrpR border of the hrp cluster in P. syringae pv. tomato (Pto) DC3000 (Lorang and
Keen, 1995); avrPphE is adjacent to hrpY (hrpK) in Pseudomonas phaseolicola (Pph)
1302A (Mansfield et al., 1994); and hopPsyA (hrmA) is adjacent to hrpK in Psy 61
(Heu and Hutcheson, 1993). Other Pseudomonas avr genes are located elsewhere in
the genome or on plasmids (Leach and White, 1996), including a plasmid-borne group
of avr genes described as a Pai in Pph 1449B (Jackson et al., 1999). 
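
Criterion (iii) above, a G+C content that differs from the rest of the genome, can be checked numerically. The following Python sketch is illustrative only: the function names, the window and step sizes, and the toy sequence are assumptions made for this example and are not taken from the sequence listing or from any cited reference.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C among the A/C/G/T bases of seq."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return sum(base in "GC" for base in counted) / len(counted)


def sliding_gc(seq: str, window: int, step: int):
    """Yield (start, gc_fraction) for each full window along seq."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, gc_fraction(seq[start:start + window])


# Hypothetical toy sequence: an AT-rich background with a GC-rich insert.
toy = "ATATATATAT" + "GCGCGCGCGC" + "ATATATATAT"
for start, gc in sliding_gc(toy, window=10, step=10):
    print(start, round(gc, 2))
```

A window whose G+C fraction departs markedly from the genome-wide value is a candidate for horizontally acquired DNA; what counts as a marked departure is a judgment left to the analyst and is not specified here.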

Because Avr, Hop, Hrp, and Hrc proteins represent promising 
therapeutic treatments in both plants and animals, it would be desirable to identify 
other proteins encoded by the Pais in pathogenic bacteria and identify uses for those
proteins. 

The present invention overcomes these deficiencies in the art. 

Summary of the Invention 

One aspect of the present invention relates to isolated nucleic acid 
molecules (i) encoding proteins or polypeptides of Pseudomonas Conserved Effector 
Loci ("CEL") and Exchangeable Effector Loci ("EEL") genomic regions, (ii) nucleic 
acid molecules which hybridize thereto under stringent conditions, or (iii) nucleic acid 
molecules that include a nucleotide sequence which is complementary to the nucleic 
acid molecules of (i) and (ii). Expression vectors, host cells, and transgenic plants 
which include the DNA molecules of the present invention are also disclosed. 
Methods of making such host cells and transgenic plants are also disclosed.
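
The complementary sequence referred to in (iii) can be derived mechanically from a given strand. The short Python sketch below is offered only as an illustration; the function name is an assumption of this example, and only the four unambiguous bases are handled.

```python
# Complement table for unambiguous DNA bases; letter case is preserved.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string.

    Characters other than A, C, G, and T (either case) are left unchanged.
    """
    return seq.translate(_COMPLEMENT)[::-1]


# First ten nucleotides of SEQ. ID. No. 1 as printed in this document.
fragment = "ggtaccgggc"
print(reverse_complement(fragment))
```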

A further aspect of the present invention relates to isolated proteins or 
polypeptides encoded by the nucleic acid molecules of the present invention. 
Compositions which contain the proteins are also disclosed. 

Yet another aspect of the present invention relates to methods of 
imparting disease resistance to a plant. According to one approach, this method is 
carried out by transforming a plant cell with a heterologous DNA molecule of the 
present invention and regenerating a transgenic plant from the transformed plant cell, 
wherein the transgenic plant expresses the heterologous DNA molecule under 
conditions effective to impart disease resistance. According to another approach, this 
method is carried out by treating a plant with a protein or polypeptide of the present 
invention under conditions effective to impart disease resistance to the treated plant. 

A still further aspect of the present invention relates to a method of 
making a plant hypersusceptible to colonization by nonpathogenic bacteria. 
According to one approach, this method is carried out by transforming a plant cell 
with a heterologous DNA molecule of the present invention and regenerating a
transgenic plant from the transformed plant cell, wherein the transgenic plant
expresses the heterologous DNA molecule under conditions effective to render the 
transgenic plant hypersusceptible to colonization by nonpathogenic bacteria. 
According to an alternative approach, this method is carried out by treating a plant 
with a protein or polypeptide of the present invention under conditions effective to 
render the treated plant susceptible to colonization by nonpathogenic bacteria. 

Another aspect of the present invention relates to a method of causing 
eukaryotic cell death by introducing into a eukaryotic cell a cytotoxic Pseudomonas 
protein, where the introducing is performed under conditions effective to cause cell 
death. 

A further aspect of the present invention relates to a method of treating 
a cancerous condition by introducing a cytotoxic Pseudomonas protein into cancer 
cells of a patient under conditions effective to cause death of cancer cells, thereby 
treating the cancerous condition. 

The benefits of the present invention result from three factors. First, 
there is substantial and growing evidence that phytopathogen effector proteins have 
evolved to elicit exquisite changes in eukaryote metabolism at extremely low levels, 
and at least some of these activities are potentially relevant to mammals and other 
organisms in addition to plants. For example, ORF5 in the Psy B728a EEL is similar 
to Xanthomonas campestris pv. vesicatoria AvrBsT, a phytopathogen protein that 
appears to have the same active site as its animal pathogen homolog YopJ, which 
inhibits mammalian MAPKK defense signaling (Orth et al., 2000). Second, the 
P. syringae CEL and EEL regions are enriched in effector protein genes, which makes 
these regions fertile targets for effector gene bioprospecting. Third, rapidly 
developing technologies for delivering genes and proteins into plant and animal cells 
improve the efficacy of protein-based therapies. 

Brief Description of the Drawings 

Figure 1 is a diagram illustrating the conserved arrangement of hrp/hrc 
genes within the Hrp Pais of Psy 61, Psy B728a, and Pto DC3000. Regions 
sequenced in B728a and DC3000 are indicated by lines beneath the strain 61 
sequence. Known regulatory genes are shaded. Arrows indicate the direction of
transcription, with small boxes denoting the presence of a Hrp box. The triangle
denotes the 3.6-kb insert with phage genes in the B728a hrp/hrc region.

Figures 2A-C show the EEL of Pto DC3000, Psy B728a, and Psy 61,
the tgt-queA-tRNA^Leu locus in P. aeruginosa (Pa), and EEL border sequences. Figure
2A is a diagram of the EELs of three P. syringae strains aligned by their hrpK
sequences and compared with the tgt-queA-tRNA^Leu locus in Pa PAO1. Arrows
indicate the direction of transcription, with small boxes denoting the presence of a
Hrp box. Shaded regions are conserved, striped regions denote mobile genetic
elements, and open boxes denote genes that are completely dissimilar from each
other. Figure 2B is an alignment of the sequences of the DC3000 (DC) (SEQ. ID. No.
85), B728a (B7) (SEQ. ID. No. 86), and 61 (SEQ. ID. No. 87) EELs at the border
with tRNA^Leu, with conserved nucleotides shown in upper case. Figure 2C is an
alignment of the sequences of the DC3000 (DC) (SEQ. ID. No. 88), B728a (B7) 
(SEQ. ID. No. 89), and 61 (SEQ. ID. No. 90) EELs at the border with hrpK, with 
conserved nucleotides shown in upper case. 

Figure 3 is a diagram illustrating the Hrp Pai CEL of P. syringae. The 
Pto DC3000 CEL is shown with the corresponding fragments of Psy B728a that were 
sequenced aligned below. The nucleotide identity of the sequenced fragments in 
coding regions ranged from 72% to 83%. Arrows indicate the direction of 
transcription, with small boxes denoting the presence of a Hrp box. 
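
The identity figures and the upper-case convention used in these alignments can be illustrated with a short Python sketch. It is a minimal example under stated assumptions: identity is computed over aligned columns in which neither sequence has a gap (the document does not say how the 72% to 83% values were calculated), and the function names and toy sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def mark_conserved(aligned):
    """Upper-case columns identical in every sequence; lower-case the rest."""
    rows = [s.lower() for s in aligned]
    if len({len(r) for r in rows}) != 1:
        raise ValueError("aligned sequences must have equal length")
    out = [[] for _ in rows]
    for column in zip(*rows):
        conserved = len(set(column)) == 1 and column[0] != "-"
        for i, ch in enumerate(column):
            out[i].append(ch.upper() if conserved else ch)
    return ["".join(r) for r in out]


# Hypothetical toy alignment, not taken from SEQ. ID. Nos. 85-90.
print(percent_identity("ACGT-A", "ACCTTA"))
print(mark_conserved(["acgt", "aggt", "acga"]))
```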

Figures 4A-E illustrate the plant interaction phenotypes of Pto mutants 
carrying deletions of the EEL (CUCPB5110) and CEL (CUCPB5115). Figure 4A is
a graph illustrating growth in tomato of DC3000 and CUCPB5110 (mean and SD).
Figure 4B is a graph illustrating growth in tomato of DC3000, CUCPB5115, and
CUCPB5115(pCPP3016) (mean and SD). Figure 4C is an image showing HR
collapse in tobacco leaf tissue 24 h after infiltration with 10 cfu/ml of DC3000 and
CUCPB5115. Figure 4D is an image showing the absence of disease symptoms in
tomato leaf 4 days after inoculation with 10^4 cfu/ml of CUCPB5115. Figure 4E is
an image showing disease symptoms typical of wild-type in tomato leaf 4 days after
inoculation with 10^4 cfu/ml of CUCPB5115(pCPP3016).

Figure 5 is an image of the immunoblot analysis showing AvrPto 
secretion by Pto DC3000 derivatives with deletions affecting the three major regions
of the Hrp Pai. Bacteria were grown in Hrp-inducing minimal medium at pH 5.5 and
22°C to an OD600 of 0.35 and then separated into cell-bound (C) and supernatant (S)
fractions by centrifugation. Proteins were then resolved by SDS-PAGE, blotted, and
immunostained with antibodies against AvrPto and β-lactamase as described
(Manceau and Harvais, 1997), except that supernatant fractions were concentrated
3-fold relative to cell-bound fractions before loading. Pto DC3000, CUCPB5115 (CEL
deletion), CUCPB5114 (hrp/hrc deletion), and CUCPB5110 (EEL deletion) all
carried pCPP2318, which expresses β-lactamase without a signal peptide as a
cytoplasmic marker.

Figures 6A-B illustrate, enlarged as compared to Figure 1, the
organization of the shcA and hopPsyA operon in the EEL of the Hrp Pai of Psy 61. In
Figure 6A, shcA and hopPsyA are depicted as white boxes. At the border of the
Hrp Pai are the tRNA^Leu and queA genes depicted as gray boxes. A 5' truncated hrpK
gene is represented as a hatched box. The arrows indicate the predicted direction of
transcription and the black box denotes the presence of a putative HrpL-dependent
promoter upstream of shcA. Figure 6B illustrates schematically the construction of
the deletion mutation in the shcA ORF marker-exchanged into Psy 61. Black bars
depict regions that were amplified along with added restriction enzyme sites, and each
is aligned with the corresponding DNA region represented in Figure 6A. The striped
box depicts the nptII cassette that lacks transcriptional and translational terminators
used in making the functionally nonpolar shcA Psy 61 mutant. EcoRI, E; EcoRV, V;
XbaI, X; and XhoI, Xh.

Figure 7 is an image of an immunoblot showing that shcA encodes a 
protein product. pLV9 is a derivative of pFLAG-CTC in which the shcA ORF is 
cloned and fused to the FLAG epitope and translation is directed by a vector ribosome 
binding site (RBS). pLV26 contains an amplified product containing the shcA coding 
region and its native RBS site. Cultures of E. coli DH5α carrying either pFLAG-CTC
(Control), pLV9, or pLV26 were grown to an OD600 of 0.8 and then 100 µl aliquots
were taken, centrifuged, resuspended in SDS-PAGE buffer, and then subjected to 
SDS-PAGE and immunoblot analysis with anti-FLAG antibodies and secondary 
antibodies conjugated with alkaline phosphatase. 



Figure 8 is an image of an immunoblot showing that Psy 61 shcA 
mutant UNLV102 does not secrete HopPsyA and shcA provided in trans 
complements this defect. Psy 61 cultures were grown at 22°C in hrp-derepressing
medium and separated into cell-bound (C) and supernatant (S) fractions. The
cell-bound fractions were concentrated 13.4-fold and the supernatant fractions were
concentrated 100-fold relative to the initial culture volumes. The samples were
subjected to SDS-PAGE and immunoblot analysis, and HopPsyA and β-lactamase
(Bla) were detected with either anti-HopPsyA or anti-β-lactamase antibodies followed
by secondary antibodies conjugated to alkaline phosphatase as described in the
experimental procedures. The image of the immunoblot was captured using the
Bio-Rad Gel Doc 2000 UV fluorescent gel documentation system with the accompanying
Quantity 1 software.
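
Because the two fractions are concentrated to different extents, equal loaded volumes represent different amounts of the original culture. The Python sketch below works through that arithmetic using the 13.4-fold and 100-fold factors stated above. The loaded volume of 10 µl and the function names are hypothetical values chosen for illustration and do not come from the experimental procedures.

```python
def culture_equivalent(loaded_volume_ul: float, fold_concentration: float) -> float:
    """Volume of original culture (in µl) represented by a loaded sample."""
    if loaded_volume_ul < 0 or fold_concentration <= 0:
        raise ValueError("volume must be non-negative and fold must be positive")
    return loaded_volume_ul * fold_concentration


def supernatant_to_cell_ratio(cell_fold: float, supernatant_fold: float) -> float:
    """How much more culture an equal-volume supernatant lane represents."""
    return supernatant_fold / cell_fold


CELL_BOUND_FOLD = 13.4    # stated in the figure description
SUPERNATANT_FOLD = 100.0  # stated in the figure description
LOADED_UL = 10.0          # hypothetical lane volume, assumed for illustration

print(culture_equivalent(LOADED_UL, CELL_BOUND_FOLD))
print(culture_equivalent(LOADED_UL, SUPERNATANT_FOLD))
print(round(supernatant_to_cell_ratio(CELL_BOUND_FOLD, SUPERNATANT_FOLD), 2))
```

Under these assumptions, a supernatant lane represents roughly 7.5 times more culture than a cell-bound lane of the same loaded volume, which should be kept in mind when band intensities are compared between the C and S lanes.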

Figure 9 is an image of an immunoblot showing that shcA is required 
for the type III secretion of HopPsyA, but not secretion of HrpZ. P. fluorescens 55 
cultures were grown in hrp-derepressing medium and separated into cell-bound (C)
and supernatant (S) fractions. The cell-bound fractions were concentrated 13.4-fold
and the supernatant fractions were concentrated 100-fold relative to the initial culture
volumes. The samples were subjected to SDS-PAGE and immunoblot analysis, and
HopPsyA and HrpZ were detected with either anti-HopPsyA or anti-HrpZ antibodies
followed by secondary antibodies conjugated to alkaline phosphatase as described in
experimental procedures. The image of the immunoblot was captured using the
Bio-Rad Gel Doc 2000 UV fluorescent gel documentation system with the accompanying
Quantity 1 software. 

Figure 10 is a series of four images of tobacco leaves showing that P. 
fluorescens 55 carrying a pHIR11 derivative with a functionally nonpolar shcA
mutation is impaired in its ability to translocate HopPsyA into plant cells. P.
fluorescens 55 cultures were grown overnight in King's B and suspended in 5 mM
MES pH 5.6 to an OD600 of 1.0, and infiltrated into tobacco leaf panels. Because the
pHIR11-induced HR is due to the translocation of HopPsyA inside plant cells, a
reduced HR indicates that HopPsyA is not delivered well enough to induce a typical 
HR. The leaf panels were photographed with incident light 24 hours later.

Figure 11 is an image of an immunoblot showing that ShcA binds to
HopPsyA. Soluble protein samples from sonicated cultures (Sonicate) of Psy 61 shcA
mutant UNLV102 carrying pLN1 (HopPsyA) or pLN2 (ShcA-FLAG, HopPsyA) were
mixed with anti-FLAG M2 affinity gel (Gel). The gel was washed (Wash) with TBS
buffer, mixed with SDS-PAGE buffer, and subjected to SDS-PAGE and immunoblot
analysis along with the sonicate and wash samples. HopPsyA and ShcA-FLAG were 
detected with anti-HopPsyA or anti-FLAG antibodies followed by secondary 
antibodies conjugated to alkaline phosphatase as described in experimental 
procedures. 

Figure 12 is a diagram illustrating the spindle checkpoint in S.
cerevisiae. The spindle checkpoint is activated by a signal emitted from the
kinetochores when there are abnormalities with the microtubules. This signal is
somehow received by the spindle checkpoint components, which respond in a variety
of ways. Mad2 is thought to bind to Cdc20 at the APC, inhibiting its ubiquitin ligase
activity. In the absence of Mad2 (and presumably damage to the spindle), the APC is
active and it marks Pds1 and other inhibitors of anaphase for degradation via the
ubiquitin proteolysis pathway; anaphase ensues.

Figures 13A-B illustrate the effects of transgenically expressed 
HopPsyA on Nicotiana tabacum cv. Xanthi, Nicotiana benthamiana, and
Arabidopsis thaliana. Figure 13A shows N. tabacum cv. Xanthi and N. benthamiana
leaves infiltrated with Agrobacterium tumefaciens GV3101 with or without
pTA7002::hopPsyA. Figure 13B illustrates Arabidopsis thaliana Col-1 infiltrated
with A. tumefaciens +/- pTA7002::hopPsyA. For all plants shown in Figures 13A-B,
48 h after Agrobacterium infiltration, plants were sprayed with the glucocorticoid
dexamethasone (DEX). Images were collected 24 h after DEX treatment. A.t. =
Agrobacterium tumefaciens; pA = pTA7002::hopPsyA. 

Figure 14 is an image of an SDS-PAGE which shows the distribution 
of HopPsyA and β-lactamase in cultures of Psy 61 (pCPP2318) or a hrp mutant, Psy
61-2089 (pCPP2318). Bacterial cultures were grown at 22°C in hrp-derepressing
medium and separated into cell-bound (C) and supernatant (S) fractions. The
cell-bound fractions were concentrated 13.4-fold, and the supernatant fractions were
concentrated 100-fold relative to initial culture volumes. The samples were subjected
to SDS-PAGE and immunoblot analysis, and HopPsyA and β-lactamase were detected
with either anti-HopPsyA or anti-β-lactamase antibodies followed by secondary
antibodies conjugated to alkaline phosphatase. Pss wild-type = Pseudomonas
syringae pv. syringae 61 (pCPP2318); Pss hrcC = Pseudomonas syringae pv.
syringae 61-2089 (pCPP2318).

Figure 15 is a graph illustrating the ability of wild-type Pseudomonas
syringae pv. syringae and a hopPsyA mutant to multiply in bean leaves. Values
represent the average plate counts from crushed plant leaves of two independent
inoculations. Wild-type (•), Pseudomonas syringae pv. syringae 61; hopPsyA mutant
(○), Pseudomonas syringae pv. syringae 61-2070.

Figures 16A-B illustrate the interaction of HopPsyA and Mad2 in a 
yeast two-hybrid assay. Figure 16A illustrates cultures of yeast EGY48 strains
containing either pLV24 (pEG202::'hopPsyA) and pJG4-5 (fish-vector), pLV24 and
pLV116 (pJG4-5::mad2), or pEG202 (bait vector) and pLV116 on medium containing
5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (Xgal) to check for
β-galactosidase activity with either glucose (Glc) or galactose (Gal). β-galactosidase
activity was indicated only in the presence of both HopPsyA and Mad2. Figure 16B
illustrates cultures of the same yeast strains on minimal medium leucine dropout
plates with either Glc or Gal sugars. 1 = EGY48 (pLV24, pJG4-5); 2 = EGY48
(pLV24, pLV116); 3 = EGY48 (pEG202, pLV116).

Detailed Description of the Invention 

A DNA molecule which contains the CEL of Pseudomonas syringae 
pv. tomato DC3000 has a nucleotide sequence (SEQ. ID. No. 1) as follows: 


ggtaccgggc tctgtgacgc agagcgtcac gcaaggcatt ccactggagc gtgaggaacg 60 
ataatcctga cgacaactat cgtgcgacgc tccgcgtcgg catgccgttc tggacgctct 120 
gcgtcctgtc ttgagaggtg cgccaagcgc aaagcacggt aagtatcagg gaggggtgta 180 
taggagggtt gcaaggcggg aggtgttcat atcaaggcag tgttcatgaa cccgtcttgc 240

ctgggctcat gaacacgttc ggcttacgcg gtcagtgcat ttcctcgctc aaatggtcca 300
gccctgccag catcaactca tgccggtgga tgtcgtccag gctggcgtag gaacccggtt 360 
tttcgttgac cgcgtgccac accacaaagt cgcgtcgtac gtccagaaac aggaagtagt 420 
gattgaaacg ctctgactcc ataaaacgtc gttgcagtgc atcacgcagt tgatcgggac 480 
gcaacgcgcg gccttctatg tgcaaggcga tcccccaatc atggtgttcg cgccgactga 540 

caaacgcgac gccattggcc actggccata ctgctgggct ctgggcggca acctgagcgt 600
aaaatgccga cttttccgtt acctcaatca tttctaatcc tttaactgca cgacagtgta 660 
atcccgctca tggtcccggt cgtccagacc ttcgcgcatg tcgggcggcc accaaatgac 720 
cagctcgcgg ttgttggagt ccgggcgttt gcaagcgttc cccgcacagc cgtgggtggc 780 
acaccctgtc agcgtagcaa acagcaagag caagagcgtt aggctacgaa tcatcatggt 840 

ttcgctcccc ggagcagtga cggcctgctt tctttggcca ttttagatat ctgcggctgg 900




cgcacagcga tgtacacctc actttcttca cccggctgca gccatgcatg aggccaggcc 960 
gcaacgccga tgacccagcg accgccgcat cggctttcgt cgatacgtac cggcttgtcc 1020 
gtgttgttac gcgcaaccac cacagcaaca ccccagtctt ttttgacgaa ccactgcgag 1080 
cgctgcccat caagcgtcag accttcgccc ggatcacaca gacttcgtgt ttcaaagggc 1140 
agggtctggc cagcgcgcag gccttccggg gcggggccgt cgatcatttg ggtaaagact 1200
ttctggatgt cgccccgcgt tggcagtcgg cctccgtcac gtcgttcctt gattttcttc 1260 
atctggtcat cgacgtcatg ggggttgccg ttctgtacat agcgtgctgg attgacctga 1320 
tcgccgatca gtcgaggggt cagaatgaac agccgctcgc gctgactcag ttcgcgactg 1380
cgggactgga acagcagctt gccgatatag ggaatgtcgc ccaacagcgg gatcttgtga 1440 

atcctgtcat tggcttccag accgtggaag ccgccgatga ccagcgagcc gtgctcggca 1500
atcaccgcct gggtgctgac attgcctcgg cgcacactgg gttgggtgtc attgatcgtc 1560 
gacacatcga tctggccatc ctcgatgtcc acgatcattt ggacctgagg cttgccatcg 1620 
ttgtccagcg aacgcggaat cacttgaagg ctggtgcccg ccgtgatggg cagaatgtca 1680 
gcggcccgct cggaagtggg cgtcaggtat tcggtgcgac tgaggtcgat cactgcaggc 1740 

tgattctcca gggtcaggat cgacgggttg gcgatgactg acgcagaacc attgccttca 1800
agcgcatgca attcggcaga aaacttgctg gcgttctgca agaacaacgt tgaactggtg 1860 
ccgccatcaa acaggttggc acccacctcc gacgctgccg ggcattgaaa ttccagccga 1920 
ctggacagtt cagccagttc attggggtcg atgtcgagaa tgaccgcatc gatttcgatc 1980 
aggttgcgcg gaacgtccag ctccttgacc agtttctggt acatggcctt gcgctctggc 2040 

aggtcgtaaa tcaatacgga gttgttacgc acatcagcgc ttacgcggat attgccttgc 2100
ctgaggcatg acccttggca gtttttttgc tgttgaagtt caatacgcgg tgcaatgccc 2160 
ctgttgcagt gctcccgtat cgataccatt ggagcccagg ttgtaaggca ggccggggcc 2220 
gcgacacctg tgctgttggc aacactgctg ccctgccccg ccaacaagtt cacgctgtca 2280 
atgctttcgc cacgcgaacg gctttccagc agctcttgaa gaatactggc gacaccggcc 2340 

accactaact gctggtcacg gtagcgaata gtccgatcag ccgcgttggc gtatttgagt 2400
ggcagcacga caacatcttg cttgtcggcc ttctcgtcgg gcttttcgac tttcttgctg 2460 
tagtcgcgca caaactccac gtatttggcc ggaccacgaa ccagaaccac gccttcgtca 2520 
ggcagcgagc cccagcccaa acgcttgtca acaagaccga catcggtcag cgccgtttgc 2580 
aggtcgtcca ccgcatccgg cgagacttcg atgcgccccg aggtgtgctc gctggaaggg 2640

ctgacataca gcgtgtcgtt atagacgaac cactggaagt ggtattcctg actcagccgc 2700
tcaagaaact cttcagggtt ctgagcacga atacgtccat cgaggtttcc ctggacaggc 2760 
gacatgtcga gcgacatacc gaactccctg gcaaagtcag ccagggcagt agacaactcg 2820 
gtctgccggg catcataggc gtaggcggtg tgtttccagg cttctggggt gaccgcccac 2880 
gtggcaggga tcaccccgat caacaataaa ggcaaccaca ttaaggcctt gcgcatttca 2940 

cactcccggt tgccggtgat tgaggatcga acgcccggac aaagtgggcg tcgtgttacg 3000
aatagtggtt tgcatcaggc tgagcatgcc cgcgcgctga ttggccaggc tttccagacg 3060 
atcgagcagg tcaccgaggc tgcaggggtt tgccatccag ctgaccagca ctacgcagcg 3120 
ggtctgcgga tcgatggcca gcgcgccgtc gcaggcacac gccaggcttg cgccgccctc 3180 
gccaagcaag gcttcgagcc gttgcgggtc accggcgtcg tacgggtcga gcagttcgat 3240 

actgcaacgc accccgtcgc cgacgaccgc cagccgagca ttggcgtcat cgatccagca 3300
gtccagcggc atcgctggac gctgggcaga ccactggcca acgatctcgg tgaattcact 3360
gaattccatc gatgactgct ttattgatac cgtgcttggc acgcaggcat tcattgacgg 3420 
caataccggc gacatcgacc tgctgctggg acatcgtgaa tgcctgcagg tcttcgacgg 3480 
tgccactctc ggaggcttcc atcgctgcct ggtccatgtt ggtgtgagca cggctcaccg 3540

aattgtcgag atggcgttgc aagctgttga aactgatcat gtcctggtgc tccagcagaa 3600
gggttcaaac cttgagtgga gcaaacccgc cgagcggttc catcatgcga tcaagtgagt 3660 
gcagagagtg tgtatcaggc agcaggctcg acacccagca gccccttgcg caggtctgcc 3720
caagcgatat cgaacgcgcc attggcatcg ctcagacgca agctgtccga ggcgatcgtt 3780
gcatcgcgct tgagttgcca gtgctcggaa aaacggctgt ctgccagcca ctcagccacg 3840 

gggtcggcta tttgggggtg aacactgagc gtcgcgaccg cttcattgag ctggctggcg 3900
gccaggtttc tggccagcgc ccgcgcacgt tcggccagcg tggtgtcgtc taacaagtgc 3960 
cgcagggatt cactcaacag ttcttctacg gcggtcattg cctgctcctg caacgcctcg 4020
cgctgcacct gaagctcgcc gagaaacgcg ttggcgtttt cccagaactg cgccagcgcc 4080
tgctgctgaa ggtgctcggc tttctcttgc tcaagggcca gtatctgcgt ggcctgctgc 4140 

cgcgcgtctg ccaggatgtc gcgcgccagc aggctgtcgg cgatgtcttc gcggcgcaag 4200
atcggttcgc gcagcagcgt agcggccgtc agagcaatac tgcgtttggc gagcatgggc 4260 
gtattcctga tgcagagaag ctggttcgga ttcaggcagc cgtgacgcgc cacatgatgg 4320 
cctgccataa cgcctgaagt ttgttttcgg gtgccttgcc gggggtgtcg ggcacttcat 4380 
tgggcgggca ctccagacac agtcgcgacc agtattgcgg cccaagccag gcgcccagca 4440 

gaagacgcgc gtcctcgtgt tcaaactcca gccagacacc ggggcgcagc gctttggtca 4500
acccccagca ccattgaccg tcaggtccgt cgctttcgtt acgggagaag cagatgcact 4560 
gcgccaggct tagcgcctgc tcacgctgcg agggcgtcag cgccaaccag cgcagcaccg 4620
gttccgcggg cgctggcggc tgagccgggt caatgcccag actctgcaga aacacgccat 4680 
gacggctggc catgagcgca tcgcagtcac tgaccgataa cccacgagcg ttggcgaatc 4740

ggtcatgcca ctccgaatgt gcccactgcc aggggttgca ccaccagtga atccagtgat 4800



cctcggcaga 
cccgatccgt 
caacaccaac 
cgtgctgtcc 
agcaggcaca 
aatactgctg 
agagtgcttg 
ctcgggcagc 
ttcaagttcc 
caccccttcc 
agcgtcgagc 
ggttttctcc 
attggaatcc 
cagcagcatg 
aggttggtca 
atttgcaacg 
gtgttcgaca 
agccgctcgg 
tcagggcgca 
ccctgatgag 
atggtcacac 
ttattggtgc 
tgcaggtcgg 
aagccgctca 
ggcgtgccac 
gaaaccgact 
gcattgccat 
tcactcacgg 
tggccgagct 
tccaggccat 
tgggccttgt 
gaaaaggttg 
aggccgttca 
ccgccaccga 
ccgatgccgg 
gatgcagtga 
agcattttgc 
agctgatcga 
ctgttgagcg 
gcttgcataa 
taattagtaa 
gtcttctttg 
ctgggcctgc 
ttgtgcatca 
gttggaagca 
accaaggttg 
aagatttacc 
cagcgtcttt 
gtggcagtcc 
ctgaatcagg 
ttcgccctgc 
cttgagctca 
gcaaacatcg 
caggggaact 
caggcgaaaa 
ttgggcggag 
aagcgctcgg 
gatctcatcc 
atcagcgccg 
cggaatggcc 
ccgtcgggca 
cagttgcgaa 
atcctcaaac 
cacacctgcg 
tcatgcggtg 



aaggctcatc 
cgcaacagac 
gcaagaccga 
agcttgaagg 
aacacgatgg 
gcgaccatct 
atgaacaccg 
accacatgca 
tgggacaagg 
ttcttgaaaa 
acgcgcacgg 
agacgtttac 
tgctcggaca 
cacaacagca 
acttgtcgag 
agcactgcga 
ctttcttcat 
acagcgcact 
tcgccgcatt 
cattctgccc 
tggttctccg 
cttgcaacag 
cgccggtgtt 
gcagttgacc 
ctgtcggctg 
gcaaaccacg 
tggccgcggg 
gcgaacccag 
gttgaccaat 
tgtcttcctt 
tgtcgtccat 
ttccgccttc 
ggacctggct 
cacccgaacc 
cagaggcacc 
tgtcatcgat 
cgagcggtga 
tcacagcctt 
acacggggaa 
aacgcccatc 
ctgatacctt 
ccggcctgga 
atcgcgatca 
accgacaggc 
accgtgttga 
gtgagtttgg 
agcgtgattg 
gcaaaaaatc 
agttccaggc 
gctttttcat 
ggctcggcgc 
cggatattgc 
ggaacgggaa 
atgcgatcag 
tacagatcgc 
gcgatcacgc 
gtttccagca 
aggtacagcg 
gtgtaggcac 
gcgcaattca 
atcgtgtctt 
atactttcgg 
ctttcatcaa 
tcacagaccc 
atacagggtg 



atgcacgtgc 
tggcgcgcca 
caggtgccac 
gcccgaagct 
aaaacttttt 
gttgaatacg 
cagcagaagc 
ccctggccac 
cgtagatgta 
tctcccccag 
cgcggttcat 
gcgcatcgat 
agccagtgaa 
gcagccctgc 
cgcctgagcg 
caacgcccga 
ctggcgtaat 
ggctatccgg 
gaataggtcg 
aagctccggc 
tcaggcggct 
cgcattgatc 
gccagcatcc 
caggtcctga 
cgtggaattg 
gtcgatgagt 
acctgtgttg 
accgccgcca 
gacgtcgaga 
cagctcgttc 
gaactgggca 
accactcggt 
catcagatcg 
agaacccgcc 
gaaattgtcg 
gctgttagcc 
ggtttcatcg 
gagctctttg 
caatgatgca 
ccaaggtagc 
tagcgttcgt 
tggcgttgag 
gcttcgcgcc 
tgtcgccggt 
caccctgcaa 
aggttaatcc 
cttggtactc 
agatctgcaa 
ttaccgaatc 
caactcgcaa 
ccagcaaggg 
cgggccagtc 
caccgagctc 
actggttacg 
gacgaaactg 
agatatccag 
ccctcagcaa 
tgccgccctg 
cgctgaccac 
tcgccaccag 
tgcccgtgcc 
caactatccc 
gactcatccc 
cggacctcgc 
cgtcttggca 



cggcagcgtt 
gtcactgcgc 
ccagagcatc 
cacccattgc 
cgaatcgaca 
tccgcgcaca 
cggttgaaca 
aatgactccg 
acgggcacgc 
cgtggtgcgc 
ttcgctggtg 
atgctgatcg 
caaatcagtc 
gctcagaaaa 
ctcttgctca 
ctcatctgca 
gcttgctgtg 
tcggacaggt 
acatccgcct 
gatacacttt 
gtcagtcagg 
agctgagctg 
tgaagcgtcg 
ttggacacgt 
tcgaccggtg 
tgaccgatca 
gcatcgattg 
ctggtaacgc 
gccgaacgaa 
atccacgagc 
actttttcca 
gtcagcagat 
gattgcccgg 
ccgccaatgc 
ccgagctttt 
gacttgccat 
agctgcccac 
ctggaagtgc 
gaggtttgca 
ggccccctct 
cgctgtggca 
cacgtccatg 
gttggcgtcg 
gcccaaaaga 
tgcgccgccg 
tgcaaatgcg 
actaggtggc 
ttctttgatg 
caaacaattg 
ttgcgatttg 
gaaacccagc 
gtggcccagc 
cctcgcggcg 
tagcggagga 
cccccgctcg 
gttgatcgtc 
tttggcttgc 
cgccgcttcg 
gccgaataac 
gcgccctttg 
ggtctcaccc 
cagattcgga 
atgaccccca 
agagtatcgg 
actccaactc 



gaacgaccgc 
accagcagtg 
aggttccaga 
gtggtctctt 
gattgcgtgg 
ctgtcgggat 
ggttcgcccg 
tcgatctgcg 
tcttcaagcg 
gagcgccgag 
gcgacagtca 
gcgaggcgcg 
tcatcactgc 
ttcacggaaa 
cgaccttggt 
cgatgtctcc 
aaagcttctc 
gcgacgctgc 
gaacgggttc 
tcaaattgct 
ccacagcctg 
ccacttgcgc 
cttccagccc 
tgcccgtcgg 
taccaagacc 
gttgacctac 
caggattacc 
cactggcatc 
actgagcggt 
cgccgtcccg 
gggtcggcat 
cgtccagcac 
cacccgcgtc 
caccgccacc 
cgtggatcag 
ccgcagccat 
tttgggtcag 
tggtgttggc 
acgaactgat 
gatgaggggg 
ctgatcttct 
gtctgcttct 
gactctttac 
atgtttttct 
acaccgccaa 
accatgattt 
agcagcctgc 
cgtcgataga 
tcgtggcgct 
agcccacagg 
acatggcgtt 
agcactttgt 
gcggccgtaa 
agcttgagtg 
acggcgtcgt 
gacgtcgaac 
agggccagcg 
acataaccga 
tcgctctcgg 
cgggctgaca 
gatagcagca 
acccgctcct 
ggacatcaac 
cgctgcaact 
ctgaagcacc 



gactgccaaa 

caccgatcag 

acggcaagtt 

ggaactctgc 

acataccggg 

caagtgcagc 

gcgcgatgcg 

acagcgtggc 

gcgtcgaaat 

gcagacccgc 

cgacaacgcc 

ctacgacctc 

agccgccgag 

cctctactgc 

cgtcaacgcc 

aggatcttcg 

ggtactgccc 

tggcccgctg 

ggagccgagc 

gagttgggaa 

gttagtctgg 

agcgctcgat 

gcgttgacgc 

gttagccact 

accacccgac 

gtcgacgctg 

cagggagctg 

accttgttgc 

ttcctgtgca 

agtagggaac 

gtcatcactg 

ggctttgccg 

gctgctcaga 

gccacccgcg 

cttgtcgagc 

ggccttggcg 

cgcctgaacc 

gctcacatcg 

gctgttaagt 

caatcagaaa 

tgttggtaga 

tcattgtttc 

tggccttggc 

gaagagtggc 

cggcgctgtt 

gatgcccctt 

gatacggttc 

gcgtacgggc 

tgagcgactc 

ccaagtgctc 

tggctgcagc 

gcagcagtgg 

aacgtgtgaa 

tcaggacgtt 

ccagcgagca 

ccagccgttc 

gcatgctatc 

ctctggagcg 

cgagggactc 

tctcatgaat 

cgtcgatacc 

cgtccagatc 

gttggataac 

cccagttcct 

gcgtcgaaat 



4860 

4920 

4980 

5040 

5100 

5160 

5220 

5280 

5340 

5400 

5460 

5520 

5580 

5640 

5700 

5760 

5820 

5880 

5940 

6000 

6060 

6120 

6180 

6240 

6300 

6360 

6420 

6480 

6540 

6600 

6660 

6720 

6780 

6840 

6900 

6960 

7020 

7080 

7140 

7200 

7260 

7320 

7380 

7440 

7500 

7560 

7620 

7680 

7740 

7800 

7860 

7920 

7980 

8040 

8100 

8160 

8220 

8280 

8340 

8400 

8460 

8520 

8580 

8640 

8700 
tgtgcctgtg ccgcttcaag gcatcctgga tgagcatttt ctcgatgatg cgcatttgcg 8760
tgcgcagccc cgtggcaggg tcaagcgctt ccacagggtc ggcgcccagc aaggggaagc 8820
cgagtacgaa gcgcttggct gcagacttca attcgcggat gttgcccggc cagtcgtggc 8880
tgagcagcag ctgcacacgc ccgctgtcca gcgcaggagc gggacgtccg aactcggcag 8940
cgataccctg ggtgaactgg tcgaacaatg gcaggatctg ttcacgacgt ttgcgcaagg 9000
ctggcaagtg aagcgtcagc acgttgagcc gaaaaaacag gtcgcgacgg aaaagtcctt 9060
gttccaccag ttcatccagt ggccgctggg ccgaggcaat gatccgcaga tccaccggga 9120
tgaattcggt cgagcccaga cgctcgatac ctcgactctc caacacacgc agcagtttgg 9180
cctgcaggct caacggcatg ctgtcgattt catccaggta caaggtgcca ccactggagg 9240
cctctatgta gccctcgcga gcccggcata cgccggtgaa tgcaccgttg accacaccga 9300
ataactggct ctctgccagc gactcgggaa tggcggcgca gttcatgccc acaaagggtc 9360
ccgacctgct ggacaactcg tgaatgcggt tggccagtgt gtccttgccg gtgccggttt 9420
ccccgcacaa cagcaagtcc atatccagaa acgcgctatt cattgcaatt tgatgacccg 9480
ctgataatgc agttacgccc caacactctc ggacgtcctt atcgatgcct gtactcatcg 9540
ttgcactctc atggtgggtg gcaagcggag tattaatacc acgtcttaca aggcagaaat 9600
atattaattt agttccccgg gaaatgagaa aaagatcaca aagttgagaa ttactatcat 9660
attaatatca ccataccaag acgaccctac cgatagactc aggctcttga gatgattgct 9720
ttaatctatc gttactccaa tgcgaacaag cgcttacagc gtccatgcgc tggctcgccc 9780
cgcaagccat agggcctctc cacacctcaa agcagctgtg atccgggaca agagcaggca 9840
cctttgagca gcaagcgccc caaaatcgcg caatgaaacg caactaactt ctcgtcacta 9900
ctcgagagaa acatataaga cttttccaaa acaactaaag gggtcacaag taaggaagca 9960
gaagaaaacc gaacacacaa aacaagaaaa ccaaacggtt tttagcggcg agcttaaaga 10020
agcgaacaac aataacacga gaaaacaaaa aacagcctga cactaactat ttgcacttta 10080
gaacagtcga taccaaccag cttagttccg ccccacgagc agtcggattt ccgaacaaca 10140
cagaggcttg gatactggca aagcggtcat agccccggtt tttcggcacc actcagtact 10200
ggcatttagt catcatcgca ttcggcaatc cgaacaaaag cccacctgct tagactattt 10260
ccaggcacag ccatctaagg aatcgcggaa aggattcagc gtagcttaat accggaaccg 10320
caggtttagg ttctgtgaac caggcggtta atacgatcga tgatcgcgtg ccatcaccta 10380
gaatgtttct aaatgtgtgt aatctttcac ttacattcgg ctaaaaaagt tcatcaaaat 10440
aatcatatgt agcgctctac atcatatggc taagcgccat ctttagggtc caaaaaacgg 10500
gtaacgctca ataaaagaag ttgtattgag gcagatcaat attgtccgac aacgagaaaa 10560
agcaccaaaa aagtgcgctt ttcaggggtt ttcaatagaa caatcgagta aaaccggggt 10620
tattggcgtg gatcactggc aaaaaccacg acgcgcggcc ccgtaggcag ctcgcgcgga 10680
ccgctgcgat actcgtcgtc atcacgcttg cgaggcgacg aacggtcatc cctgatgcgg 10740
ggcaactgta tccggtttgt aagcggatca ggttccacaa caggtgcgga ttgggcgatc 10800
tctaccgccg gcgctgattc agctgcagga gctggctgta acgcctcagg cgcagtgggc 10860
tgctgagcca ccggcaacgg ctgagccgtt ttgggcgaag gcaggttctc ggctaactgg 10920
gccgactgca cgggcttggg cagcggcgga cgctctgcaa cgcgcactgg acgctcagcc 10980
acaggcgcgg gcgcgggcag acgctcagcc gcccgtttca caatggctga aggggtgacc 11040
agcgggatgc tggcagtcac cggggactca ccggtaatgc gcgcgatgct ggtcgtgagc 11100
acgcgattct gggttttagg tatcagcaga cgtcccggtc catcgaaggt ctttttgcgc 11160
aggaatgccg agttcagccg caacaactgg ccctcatcca cacccgccgt ggccgcgagc 11220
tgggtcaggt ctacggcatg gttaagctcg actacgtcaa aatacggcgt gttggcgacc 11280
ggggtcagtt tcacaccgta ggcattgggg ttgcgcacaa ccattgagag cgccaacagt 11340
ctgggcacgt aatcctgggt ttccttgggt aaattcagat tccagtagtc cacaggcaga 11400
ccacgccgtc ggttggcctc aatcgcccga ccgacggtgc cctcccccgc gttataggcg 11460
gccagcgcca gcagccagtc attattgaac tgatcatgca agcgggtcag gtaatccatc 11520
gccgccttgc tggaggccac cacgtcacgg cgagcgtcgt aggtcgcgct ttgatgcaga 11580
ttgaagctgc gccccgtgga tggaatgaat tgccacaaac ctgccgcagc ggccggagag 11640
ttggccatgg ggttataaga gctttcgatc atcggcagca gtgccagctc cagcggcatg 11700
ttgcgctcgt ccaggcgctc gacaataaaa tgcagataag ggctggcccg gacactggct 11760
cccgtgataa atccgcgatt gctcagcaac cagtcgcgct ggcgagcgat acgctcattc 11820
atgccttggc catcgaccag cctgcagcgc tgggcaaccc gctgccacac gtcctcgccg 11880
ttataaacag gcagatcgga gattttgtct gcagcccgcg aaccttcctt atcatctccc 11940
ccccaataga ccagccccga caccagccgc ggcggacggt cctgacgcgg cggcgaatag 12000
tccacagact ggcagcccac acacaaggcg cccatagcga ggactgcgat ttgaacagcg 12060
cgagccagca agcgtgggct cgatacgggg aaggcgacgg cgggcatggg cgggaatgtc 12120
ctgagcgtgt ccaccctacg tggcacgctc gccgttacgg ttcccttttg aaaccgagat 12180
cggcgcacac aacgcattgc tgaatccttt cagccgtaag tttttccgat ggaacccgct 12240
ggcattgcat gccactcatc ctgtgaagga attttcacgt ttggtatcag gcggctatca 12300
gcgataaaat ggacagagag attcaccgtg cagtcaccat cgatccaccg gaacaccgga 12360
agcatcattc agccaaccgt cacccctgac gcacgtgctg caactgacct gcaggaaaga 12420
gccgaacaac ccaggcaacg ctcttcgcac tcgttgagca gtgtcggcaa gcgggcgctg 12480
aaaagcgtcg gtaaattgtt ccagaaatcc aaagcgccgc agcagaaagc tgccacgccg 12540
cccaccgcga aaaacgtcaa gacgcccccg cctgcttcaa atgtggctac gcccagaaac 12600
aaagcccgcg aatccggttt ttccaacagc agcccgcaaa atacccatag ggcacccaag 12660
tggattctgc gtaaccaccc caaccaggcg agcagctcgg gcgcgcagac gcatgaaata 12720
cacccggagg cagccccccg taaaaacctg cgcgtaaggt ttgatctgcc gcaagaccgc 12780
cttgagcgca gcccgtcgta cctcgattca gacaacccga tgaccgatga agaagcggtc 12840
gcaaatgcca ctcgccaatt ccggtcacct gacagtcacc tgcagggctc tgacggtacg 12900
cgcatttcaa tgctggccac agatcctgat cagcccagca gctccggcag caaaatcggt 12960
gattcggacg gaccgattcc gccgcgcgag cccatgctgt ggcgcagcaa cggaggccgt 13020
ttcgagctga aagacgaaaa actggttcgc aactcagagc cacaaggcag cattcagctg 13080
gatgccaagg gaaagcctga cttctccacg ttcaatacgc ccggcctggc tccattgctc 13140
gattccattc ttgccacacc caagcaaacc tacctggccc accaaagcaa agacggcgtg 13200
cacgggcacc agttgctaca ggccaacggg cactttctgc acctggcgca agacgacagc 13260
tcgctggccg tgatccgtag cagcaacgaa gcactcctta tagaaggaaa gaaaccaccg 13320
gccgtgaaaa tggagcgtga agacggcaac attcacatcg acaccgccag cggccgcaaa 13380
acccaagagc tcccaggcaa ggcacacatc gctcacatta ccaatgtgct tctcagtcac 13440
gacggcgagc gtatgcgtgt gcatgaggac cgtctctatc agttcgaccc gataagcact 13500
cgctggaaaa taccggaagg cctggaggat accgctttca acagcctgtc cactggcggc 13560
aacggctcgg tttatgcaaa aagtgacgat gccgtggtcg acttgtcgag cccgttcatg 13620
ccgcacgtgg aagtcgaaga cctgcagtca ttttcagtcg cgccggacaa cagagcagcg 13680
ttgctcagcg gcaaaacgac ccaggcgatc ctactgactg acatgagccc ggtgattggc 13740
gggctgacgc cgaaaaaaac caaaggcctt gagctcgacg gcggcaaggc gcaggcggcg 13800
gcggtcggtt tgagtggcga caagctgttt atcgctgaca ctcagggcag actttacagt 13860
gcggaccgta gcgcattcga gggcgatgac ccgaaattga agctgatgcc cgagcaggca 13920
aactttcagc tggaaggcgt gcccctcgga ggccacaacc gcgtcaccgg attcatcaac 13980
ggggacgacg gcggtgttca cgcgctgatc aaaaaccgtc agggcgagac tcactcccac 14040
gctttagacg agcaaagctc aaaactgcaa agcggctgga acctgaccaa tgcgctggta 14100
ctgaacaaca atcgcggcct gaccatgccc ccgccaccca ccgccgctga ccggctcaac 14160
ctcgatcgtg cgggcctggt tggcctgagt gaaggacgca ttcaacgctg ggacgcaacg 14220
ccagaatgct ggaaagacgc aggcataaaa gatatcgatc gcctgcaacg cggcgccgac 14280
agcaatgctt atgtactcaa gggcggcaag ctgcacgcac tcaagattgc ggccgaacac 14340
cccaacatgg cttttgaccg caacacagca ctggcccaga ccgcacgctc gacaaaagtc 14400
gaaatgggca aagagatcga aggcctcgac gaccgagtga tcaaagcctt tgcaatggtc 14460
agcaacaaac gcttcgtcgc cctcgatgac cagaacaagc tgaccgccca cagtaaggat 14520
cacaaacccg tcacactcga cattcccggg ctggaaggcg atatcaagag cctgtcgctg 14580
gacgaaaaac acaacctgca cgccctcacc agtaccggcg ggctttactg cctgcccaag 14640
gaagcctggc aatcgacaaa gctgggggac cagttgcgag cccgctggac gccggttgcg 14700
ctgcccggag ggcagccggt aaaggcactt ttcaccaacg acgacaacgt gctcagcgcc 14760
cagatcgaag acgccgaggg caagggtctt atgcagctca aggcaggcca atggcaaagg 14820
ttcgaacagc gcccggtaga agaaaacggt ttgaatgatg tgcactcgcg catcacaggt 14880
tcaaacaaga cctggcgaat tccaaaaacc gggctgacgc tcagaatgga cgtcaataca 14940
ttcgggcgca gcggtgtgga gaaatccaaa aaagccagca ccagcgagtt catccgcgcc 15000
aacatctaca aaaacaccgc agaaacgccc cgctggatga agaacgtagg tgaccatatt 15060
cagcatcgct accagggtcg cctgggtctg aaagaggttt atgaaaccga gtcgatgctg 15120
ttcaagcaac tggagctgat ccatgagtcc gggggaaggc ctccggcacg gggtcaagac 15180
ctgaaagcgc gcatcaccgc actggaagca aaactggggc ctcaaggcgc tacgctggtc 15240
aaggaactgg aaaccctgcg cgacgagctg gaaaatcaca gctacaccgc gctgatgtcg 15300
atcggtcaga gctatggcaa ggcgaaaaac cttaaacagc aggacggcat tctcaaccag 15360
catggcgagc tggccaagcc gtcggtgcgc atgcagtttg gcaagaagct tgctgatctg 15420
ggcacaaagc tcaacttcaa aagctctgga catgacttgg tcaaggagct gcaggatgcc 15480
ttgactcaag tggctccgtc tgctgaaaac cccaccaaaa agttgctcgg cacgctgaag 15540
catcaagggc tgaaactcag ccaccagaaa gccgacatac ctttgggaca gcgccgcgat 15600
gccagcgagg atcatggcct gagcaaagcg cgcctggcgc tggatctggt cacactgaaa 15660
agccttggcg cgctgctcga ccaggtcgaa cagctaccgc cgcaaagcga catagagccg 15720
ttacaaaaaa agctggcgac gctgcgtgat gtgacttacg gcgaaaaccc ggtcaaggtg 15780
gtcacagaca tgggctttac cgataacaaa gcgctggaaa gcggttacga atcggtcaag 15840
acattcctca agtcgttcaa aaaagcggac catgccgtca gcgtcaatat gcgcgcagcc 15900
acaggcagca aggaccaggc cgagctggcc ggaaaattca aaagcatgct caagcaactg 15960
gagcatggcg acgacgaagt cgggctgcag cgcagctacg gagtgaacct caccaccccg 16020
ttcatcattc ttgccgacaa ggctacaggg ctctggccaa cggcaggtgc caccggtaac 16080
cgtaactaca tactcaatgc cgagcgttgc gagggcggcg ttacgctgta cctcattagc 16140
gaaggtgcgg gaaacgtgag cggcggtttc ggtgccggca aagactactg gccgggcttt 16200
tttgacgcaa ataatcctgc acgcagtgtt gatgtcggca acaaccgcac actgaccccc 16260
aactttcgcc tgggcgtgga cgtgaccgcc accgtcgccg ccagccagcg cgccggggtg 16320
gtcttcaatg ttccggatga agacatcgac gcattcgtcg acgacctgtt tgaaggtcag 16380
ttgaatccat tgcaggtgct gaaaaaagca gtggaccatg agagctacga ggctcggcga 16440
ttcaacttcg acctcacggc aggtggaact gccgatatac gcgccggaat aaacctgacc 16500
gaagaccgag acccgaatgc cgaccccaac agcgattcgt tttctgcggt agtgcgcggc 16560
ggattcgctg cgaacatcac cgttaacctg atgacctaca ccgattattc gttgacccag 16620
aaaaacgaca agaccgaact gaaggaaggc ggtaaaaacc gcccgcgctt tttgaataac 16680
gtgacggccg gcgggcagct tcgcgctcag atcggcggca gccacacggc ccccacaggc 16740
acacccgcct ccgccccagg ccccactccc gcatcacaaa cagccgccaa caacttgggc 16800
ggagcgctca atttcagtgt ggaaaacagg acggtcaaac ggatcaagtt tcgttacaac 16860
gtcgccaagc cgataacgac tgaaggtctg agcaaattgt cgaagggcct tggggaagcg 16920
ttcctggaca acacgaccaa agcaaaactg gcggagctgg ccgaccctct gaatgcacgc 16980
tacacaggca agaaaccgga tgaggttatt caggcgcaac tcgacgggct tgaagaactg 17040
tttgccgaca taccaccgcc caaagacaac gacaagcagt acaaggcatt gcgcgacttg 17100
aaacgcgcgg cggtcgagca tcgggcatca gccaacaagc acagcgtgat ggacaacgca 17160
cgctttgaaa ccagcaaaac caacctctcc ggcctgtcca gtgaaagcat acttaccaaa 17220
ataatgagtt ccgtgcgcga cgcgagcgcc ccgggcaatg cgacaagagt tgccgaattc 17280
atgcgccagg acccgaaact tcgcgccatg ctcaaggaga tggagggcag tatcgggacg 17340
ctggcacgcg tacggctgga accgaaggac tcactggtcg acaagatcga tgaaggcagc 17400
ctcaacggca ccatgactca aagcgacctc tccagcatgc tggaggatcg caacgagatg 17460
cgcatcaagc gtctggtggt attccacacc gcgacccagg ctgaaaactt cacctcacca 17520
acaccgttgg tcagctataa cagtggagcg aatgtgagcg tcactaaaac actggggcgc 17580
atcaacttcg tttatggcgc agaccaggac aagccgattg gttacacctt cgacggcgaa 17640
ttgtcacgac catcggcatc gctcaaggaa gcggctggcg acttgaagaa agaggggttc 17700
gaactgaaga gctaataacg aaaacagtaa aaaaagcgcc gcattgaagt ggcgcttttt 17760
tattcaagcc tgtaaaaaag cacgcgcttc acgtgcctgg gaaatgaacc cgcgcgtcac 17820
gtcacaaaac gctggctcat cgagtgaggc cagttcacgc tgcgcgcata gacggacatc 17880
tccctgatcg accgcaaacc agcagccatg caagcgcgct acgtcgaagt tcagactcaa 17940
cagacgcagc aaatcggggg ctcgttccgg gcagcggcca atgcggcaat gaaagatgac 18000
catctcactg tgctcgggca attcaatgat cgccgcttcg ttgttctgac cgtcataaag 18060
agcgcatacg ccgttctgca aggtcagtga cgtgccgagc tgggcgccca gagaattgat 18120
gaagcgggcg aaatcgggtt gcgaagtttt catcgtcata gtcctttaag gttaaaacag 18180
catgaagcat gccggacagc aggcgcctgc agcctgtgtc cggcgccggg attaacgcgg 18240
gtcaagcaag ccctcttcaa gtgccctcaa tgcgtcatcg tcttttgtcg gctgcttaag 18300
cgcctcgcgt gctgacgcga ctgcgttcaa cacaccttca tccacgaccc gaaccgtatc 18360
cacggccatc tgggtaggca actgcaatgc gcctcgtccc atgtgatagg cgttttccgc 18420
gactcgtggg ataccgctca acgtgctctt ctggaacgta tgtggcagag actccctgtt 18480
cggatgacgg atgttattca aagcgtctcg gtacggtcca gcataggtgt tgcaccgccc 18540
atgcctgccg ctttcaacgc cttggcttct gcggtaaccg actggttggt gtacaacgtg 18600
gacagatagg acaccgaacc cgtcgctgcc agggccatgt tgcgcaaaat agcccccgca 18660
ctgagcgtgc cacttgcgcc ttcagcctga gcggtcacag gcggcagtgc cgaggtcagt 18720
gcagaactct gaatacccga aagagccttg ctgtagaacg tggtgcgtac cgacggctcg 18780
cgcaggtcca tacctttgag caggtccttt ttcagatcgc tctcggcgcg gtccggggta 18840
aataccggaa ttttgcgccc ttgcgggtcg acataattcg acttcaattg cagcagcgtt 18900
tgcgaactgg cagacaccgc cccgccaaaa ccggatgcca gagctcttgc actcagcgtc 18960
tgcccattga tctggtgaac atcgttgagc atctggcgca cagcctgaga accaccgaag 19020
gcactgtaag ccatcagctc acctaccgga tgggtggacg aaccctgaac cttcttctgg 19080
ttcagcagcg cgcgttcact tttcacgaac gccttgtcct gagcgacttc ctcgggcgtt 19140
tttttgacca gctcaccgtg ttcgcttttc agctcgaagg ggtcaggaat aaccgtattg 19200
gtatccacag ccttcattgg caccatgttc aggcgttcgt tgaggccagt cttctgcaag 19260
gcggcctgaa acatcggctt gaccacgctg ttgaccgtct cgtgagcaat gcccgccacc 19320
atcccgatta tcgaagcctt gagcatgttg gcgtcgctgc tggtctcggg aatcgtgtct 19380
cgcagcttgt cgctggtgga caaacgcaca taacccaagt gtgtcattga agacaagaac 19440
tgcggaaccg cagccgcgac aatcggccct gcacctttcc agccacccac cgtgttacgg 19500
gcagtgacga gatcgctgac gacgttgtcc agttgcgtat gtgcggcgac cgaagcaagg 19560
cgcttggcct ccggcgactt gacgaaatcg gcgtgcaaac ctaccagggt ggttttggcg 19620
tcgaccagcg cctgcctgtc agcgtgcaga gactccttgt tgccctgttc ggcatcttgc 19680
agagtgagat ccagcgcact gatgtgctca tccagcgacg cgatgctgtt gctcaggcct 19740
tcgccgattg ccttgcttgc acgaccggcg tattcgccaa gggcagtctg actgacggca 19800
agcgtcgcct tgtccgcttt tgcatgctgg cctaccgttg cgggcgaagc gtcatgcatc 19860
agttgaaagt gctccagttg atcagcgacc gactgagcaa aacccttgat cagttgcccg 19920
acctcggctt tatccggtat ctgacccggc tgggcgaatt tttccagccg ctgctgcaag 19980
tccgagccct gaaactgctt cagttgatag cgctcaggag acaatttctc ggccatgact 20040
tcaaaaggca aaggctcggc ctgcagcaga ctaccgatca acaacgcagc acgcgaactg 20100
atcatcggcg cgccgctgac cggagccgtc ccatgctcag ccttgaaggc ctgcaaaagc 20160
tgtgtgtgtc gagccgcgac attcagccgc gccgcgccgg cagacgagct ttctgtcgcg 20220
tgtgaccctg actgatcggg agtcagcggc ggattcatgc ctgcagtgac tgcatttggg 20280
tgagctgtct gggcgggaac agtatcgtgc tgctggttta cccggctgag tttgacgcca 20340
ccggccccgc cgatccgcga actgatcatt ggaatctccc aggagccgaa aggctctcgc 20400






gtttggctgc tggggcaaca ggttggtccg tcgaggagcc tgcagttgtg gcctgcccca 20460
tgaatccatg ctcgcgccac tctttggcca ggtcggaaaa cgacttcatc aacaacagca 20520
cgccttcggc agaggctcgt tcaagggcca cagagcccat cagcagcaca cgaccggtct 20580
gcgcattaaa ggaaaatgcc gggctgtggg cgcccgcgaa catgtgaaag ttgatgtcca 20640
tcaacgccag caacgcgctc tcacggccgc gcgcgggcaa cgcgcccatg tcaccgtaga 20700
tcagaacggc acggccttcg tcgcggtcct gaaactgcag ggtgaagtcc acttcgctga 20760
ttttgaaatt ggcagattca tagaaacgtt caggtgtgga aatcaggctg agtgcgcaga 20820
tttcgttgat aagggtgtgg tactggtcat tgttggtcat ttcaaggcct ctgagtgcgg 20880
tgcggacgaa taccagtctt cctgctggcg tgtgcacact gagtcgcagg cataggcatt 20940
tcagttcctt gcgttggttg ggcatataaa aaaaggaact tttaaaaaca gtgcaatgag 21000
atgccggcaa aacgggaacc ggtcgctgcg ctttgccact cacttcgagc aagctcaacc 21060
ccaaacatcc acatccctat cgaacggaca gcgatacggc cacttgctct ggtaaaccct 21120
ggagctggcg tcggtccaat tgcccactta gcgaggtaac gcagcatgag catcggcatc 21180
acaccccggc cgcaacagac caccacgcca ctcgattttt cggcgctaag cggcaagagt 21240
cctcaaccaa acacgttcgg cgagcagaac actcagcaag cgatcgaccc gagtgcactg 21300
ttgttcggca gcgacacaca gaaagacgtc aacttcggca cgcccgacag caccgtccag 21360
aatccgcagg acgccagcaa gcccaacgac agccagtcca acatcgctaa attgatcagt 21420
gcattgatca tgtcgttgct gcagatgctc accaactcca ataaaaagca ggacaccaat 21480
caggaacagc ctgatagcca ggctcctttc cagaacaacg gcgggctcgg tacaccgtcg 21540
gccgatagcg ggggcggcgg tacaccggat gcgacaggtg gcggcggcgg tgatacgcca 21600
agcgcaacag gcggtggcgg cggtgatact ccgaccgcaa caggcggtgg cggcagcggt 21660
ggcggcggca cacccactgc aacaggtggc ggcagcggtg gcacacccac tgcaacaggc 21720
ggtggcgagg gtggcgtaac accgcaaatc actccgcagt tggccaaccc taaccgtacc 21780
tcaggtactg gctcggtgtc ggacaccgca ggttctaccg agcaagccgg caagatcaat 21840
gtggtgaaag acaccatcaa ggtcggcgct ggcgaagtct ttgacggcca cggcgcaacc 21900
ttcactgccg acaaatctat gggtaacgga gaccagggcg aaaatcagaa gcccatgttc 21960
gagctggctg aaggcgctac gttgaagaat gtgaacctgg gtgagaacga ggtcgatggc 22020
atccacgtga aagccaaaaa cgctcaggaa gtcaccattg acaacgtgca tgcccagaac 22080
gtcggtgaag acctgattac ggtcaaaggc gagggaggcg cagcggtcac taatctgaac 22140
atcaagaaca gcagtgccaa aggtgcagac gacaaggttg tccagctcaa cgccaacact 22200
cacttgaaaa tcgacaactt caaggccgac gatttcggca cgatggttcg caccaacggt 22260
ggcaagcagt ttgatgacat gagcatcgag ctgaacggca tcgaagctaa ccacggcaag 22320
ttcgccctgg tgaaaagcga cagtgacgat ctgaagctgg caacgggcaa catcgccatg 22380
accgacgtca aacacgccta cgataaaacc caggcatcga cccaacacac cgagctttga 22440
atccagacaa gtagcttgaa aaaagggggt ggactcgtcg agtccacccc ctttttactg 22500
tttagctaca gctcacagat tgcttacgac cgcataggcc gaaacggtat ttcacttgga 22560
gaagccgccg tgcccccctc ttctatatca gcttcacgag ccgggcgttg acgcaggtta 22620
ttgaccgtat tgcgcaagct ggcgccggta tgggtgatcg cctccccgcc catgtctttg 22680
acggtcttcg ccagtttgac ggtctggtcg gctacgtagc ctgtggtact ggatgcagtc 22740
gatttcaccg tgtcctgtat gaacgactcg gcttttttca ccgcgggatc ggttgtcagc 22800
gcggccgtgg tccagcctgc gaaaacggct gccgaacctg ccaggttggt caactgactg 22860
accgcggcct tggtcgccgg gtcggtgata tttttcgtcg ccatctcctg caacttgcct 22920
acccctgcaa agccacccgc cagggccaga ccgttttggg tcaggctgga cgctgacacc 22980
aggcttctta ccgcacccat tgcgtcggtc gccatatcca gtggcagacc ggccatccgc 23040
ttgccagcgt tgagcgccgc acccgagtag ctggccgatt tgattgcttt ataagcctcg 23100
agccagtcgt tttcttcgct cagttgagcc ttgggctctt tatccttcaa accgagcact 23160
aatgcaccgc cacgctggtg atcacgcgac tgcacactga gcaggcggtt gccaaagcct 23220
gcgttggcag ccagaccacc cgccatcgat acaccaaggt ccacagcacc ctgcacggcg 23280
ggtctggacg ccagtgccgg agccaatacg gtacgtacgg cgttgcgcgc cgagtacgtc 23340
tgaaccgcaa cccccgtgtc cagaacctgt cgagcaaggc ttggcgagtg gcgcttcacc 23400
gaagcggcca tcgcatcgtg gagcctgtcc ggcgaggcgc tcaggtaatg cagatcaccc 23460
gtcgcgcggt ccatcatctt ggtgcccacc tggtccatgg cgcccgacag cgctccggaa 23520
atgagcgggg tcagcggttt gagcggagcc ggcagccaat cgcccttgtt gatcgcaggc 23580
tgcatgtact gaagcaacga ggccatggca aagggcgtcg cccgcaacgc gcctgatgta 23640
gtcgtcgcca atcggtcgag cttttccgcc ttggcgaagg tgtcggcgat ggttgccggg 23700
gtttcccctt cgaagtgcag gcggctggcg cgcgtctcga tcagcgcagt gatctgcgca 23760
ttgtgtacgt caactgcagc ttggccatca gccgaatcgg ccggcggcag tttatgcgca 23820
gcgaacacat gatctgtcag gtaatcggca atcgcattta tctcgcgttg ctgatcggag 23880
ctgacagatc gcacagagct ggaggcaaga gacgcgtcgg acgctgtccg aaagctatcc 23940
gtcgcagtca caggcggttg ttggacgcgt cggttgatgt gcatggaaat tccctctcgt 24000
tctacggaag tttgaacagc gcagtgctga agcgggcgtg tccggagcga ctacttgcgt 24060
gaaagcaata cagtgaactg tcgatcaaac agcgccagaa acagcgaaac gtccggtcgt 24120
ccgccggttt aaaaggatcg acgaaggctg tgtggtcccg gatcggttga cggttccact 24180
gaataatctg cgtacgccca ctaccaagga ctgcgccgaa aaatcaccgt cgtttgtgtt 24240
gcagattacg caaattgaaa ttaagcgagc tttaaggatg gcagcgtaag ttcacaacat 24300
ggcttggcgc
tggtcctttc 
gctgttctgg 
caaaggcatc 
gatcagcttt 
cgcaatggtc 
acgggatgac 
gcgtgccctg 
gtcgcccgac 
tggctctgct 
gacccgcctg 
cgccaacacg 
ctgcatcctg 
cacggtggta 
gttcggcaac 
cctgcaatcg 
cgtaccgtgg 
aggggaaggc 
tgcagacgtt 
agaaaatggc 
aaatgcagac 
agaagcatct 
agagcaaccg 
aacttgggaa 
tgctgctcga 
tccagtttgt 
actgatagcc 
gctctatcat 
tgacagccgt 
ctgttgtccg 
gaggcgatag 
acctgctcgg 
aaaaagacct 
cgggcgaacg 
tgcgtcttag 
ctgtcaggtc 
ctgggcggtt 
gacggccaat 
cattgagtgc 
gccacctgtg 
ctgcggcatc 
caaccagagc 
cgcatcgatg 
catggaaatt 
ttccctttag 
cgccgatggc 
gggttgaatt 
ctggttatat 
cgttgcatct 
agttggccca 
tacgacctca 
agtggaggat 
ccgacgcttt 
gcatgcacct 
gtatcagtga 
accagagtct 
ctgcccatat 
ccgtccagaa 
ctccactcct 
aggatcagaa 
tgacctcatc 
tatgcaacag 
tgatatcatt 
agcaactgtt 
ccagctacgt 



ttagcgagta 
gagaaaaaat 
cttctgctct 
gacttccccc 
cgcaactcga 
aacacttcac 
ctcaacaacc 
cgcgcgcacc 
gagattcagc 
gcggttatct 
gaatcgacca 
ccactgccct 
atgccgctga 
ggctgcatgc 
agtcagcacc 
atgttctctt 
cgcgtggcca 
gcgaggctta 
gctccgtgcc 
tcatgttgct 
atccctgact 
ggtctttacc 
catagtgcgc 
gtgtgtccag 
tatgctggaa 
gaaagaaggc 
ccagacaagc 
cgatagtttt 
gctcttgggc 
ccatagcctt 
agaccatcag 
gactgggaag 
cttcatgccc 
catccgacga 
cggcaacccc 
atgaacgttc 
tttccggctt 
gtgctaattc 
acactgcgca 
cgagcaggct 
gtcagccttt 
agacactcgc 
cctcgattgc 
ccccgctcgt 
ggtttgcact 
aaaggtaacg 
tgttgggtga 
caatgtcact 
atcccggcac 
tacttataga 
ccgcttttct 
tgaacgtgct 
tggagagcac 
cgtcaactgc 
acaggcgcac 
ttccaaggcc 
caccccgggc 
gtacgaccat 
cgatcaagcg 
cctgacaagg 
cacagtggtc 
caaaggctgc 
caagcacctg 
aaaggctcat 
gagcaaggct 



agcgccttct 
ggcggtgttt 
gggacgtggc 
tgatgcccct 
gtgcctataa 
gcagttttgg 
ctgtcaaagc 
tcaaaggcga 
gcgccagcca 
cgcaagcctt 
tggtcgatct 
acccctacgt 
gcatggtcac 
tgctggcaat 
ggatccgcat 
cgccagagag 
acgcatcaat 
tcgcaagtga 
acgccagtgc 
gaagctgtct 
gtcctgatgc 
gggctgcaac 
gtgctgtgct 
aagcataggt 
gcccattacc 
ctcatccgac 
gtgcccgtcg 
ttcaaataga 
aatctttctt 
gattctggtc 
atccggtagc 
atcagcggca 
ctccaatggg 
accgggggcg 
tgattgggcg 
gtggggtcag 
gctcctggcg 
gcgtcatgag 
acaacagttc 
ccagattcag 
cgatctgtgt 
ttccattcgc 
gcagccactg 
ttaacgatga 
aatatcaatg 
ggatgggcag 
cgttaaaacg 
tggcggctgc 
tcgccaaggt 
cgtgccgttt 
gcccgaaaat 
cggttgatcc 
accagggatt 
ctgaaagccg 
ggcgaaaaat 
ttgacctctt 
atgcggatca 
gaggcattca 
ggtaagaaac 
caattcagta 
ctgcgctggc 
aaccagtgca 
caagccgagt 
gccaagaaag 
gatcggttga 



tccaaaccag 
cacccgaacc 
cgtcaccgtg 
cacgttgctt 
ccgttggtgg 
ccggcaggta 
catactcttt 
cgtcaaaaca 
gagcaacaac 
tgccgccggc 
gtccaactgt 
ttatttccca 
caccctgggc 
ggaccgcatc 
ggaagacctg 
gcagccgctg 
tggcggtctg 
aagtctgctc 
gtacctacgt 
gcctgaacca 
agagccatcg 
actgctttga 
ctgcccagcc 
gctgcgttct 
ctgggtagca 
tgcccttttg 
ccacccgcgc 
aatttgctct 
ttggcttcga 
ttgatgtatt 
agggtacgca 
tcgaccgacg 
acaaaggcgc 
agtccggaca 
ccagattgct 
atggacagcc 
tcgataatct 
gtgatcaagt 
ccttgaatca 
cgccattgcc 
gaagatgaac 
ggtccttacg 
ataaagccga 
ttttcctctg 
cgattcttgt 
cgagtttttg 
aaggaatgta 
tggagcctga 
tgggcgtggg 
tccctcgcgt 
cttggcggtg 
atatttttac 
caaacccgcc 
caacgtaagt 
tcctgcgccg 
gatgcgcttg 
cgcgaaaggc 
cccttggcct 
ctgaagccct 
atgatctgga 
gaaaacacga 
caccaccaga 
agaagcacat 
ccagcgctaa 
agctggcggc 



caaaggagtg 
gtgacctacg 
gacgtcatgc 
tgctcggcac 
gaagcgcgca 
ctgacgctga 
caacgtcatg 
gcaaaactcg 
ttccccaatg 
cagttcgaca 
cagggcggca 
cggctgttca 
tggttcaccc 
ggtacagacc 
tgcaacacca 
ctggctgacc 
agcaggcaga 
tgggcaccat 
cgcgcttgaa 
cgccaaaaag 
catggctatc 
gatcgcgatc 
cttttccaag 
gcaacttgtt 
atgcatcgcc 
cacggctctg 
ggccatagtc 
ggtgaaacgg 
tgttcgcagt 
gcgtggcgcc 
acgaatgaag 
aaaaggaaga 
ccgccttttc 
atgacgaggg 
ggatatacat 
ggtaagaacc 
tccagatagc 
ccggtctcat 
gggttatagc 
agaatcaaaa 
aacgaagtgt 
ttgtggcgtt 
tcttttgcct 
tggttcaaga 
aaaaatcgac 
gtaacgttgc 
tgcttaaaaa 
tgattcatct 
gaacccataa 
tggacacact 
atgaccgcaa 
tgcgacagaa 
ttaaaagctt 
aaaattttgc 
catgctccac 
cgacgtataa 
ctccgatacc 
cgaatcgatt 
ctgctactgc 
cttctacctg 
gcaggtctgg 
accgggttcg 
gaaccgtcgc 
actggcaccg 
agagtccggt 



ccgcaatgtc 
ttggctggtc 
tgatagaagg 
tgatcgtgct 
ccttgtgggg 
tcgatggcga 
tggcttactt 
acgggttact 
acatcctcaa 
gcatccgtct 
tggagcgcat 
gcacgctgtt 
cggcgatctc 
tgcaagcccc 



tgaaaagccc 
aaaacaggtt 
ttcgctcagt 
cacatcagca 
aggatcaaaa 
actcaaaaac 
aaggttttcc 
tgtcatgccc 
tgaataggcc 
ctgatagtcc 
acaccaattt 
agcagcaaac 
gtggacaagc 
cgcgcctatg 
gtcacgtaat 
ctggggttgt 
gcgcgcatcg 
gggatgaaaa 
cttatcgtgt 
aaaccgccct 
gaggctcttt 
gctgcaacga 
ccagatccgc 
caagcgcagc 
tgacgttgtc 
cctgttctgg 
gaccctcctg 
cgacaggccg 
cgtgatgcgg 
tcgtgagtgc 
cgttgttgca 
atgcctgcta 
ggacggcgag 
cggagggcag 
gctgctgccc 
atgtcagttc 
gagtgcggcc 
tatatgcgtg 
tccgctcgga 
aagtcgattc 
ccgtcgtagc 
tgccagagcg 
ctttccggac 
ctcttgctcc 
tttcatctaa 
acagaatgca 
acagttaagc 
aagaaaatac 
gcaaacaaat 
aacgacccga 



24360 
24420 
24480 
24540 
24600 
24660 
24720 
24780 
24840 
24900 
24960 
25020 
25080 
25140 
25200 
25260 
25320 
25380 
25440 
25500 
25560 
25620 
25680 
25740 
25800 
25860 
25920 
25980 
26040 
26100 
26160 
26220 
26280 
26340 
26400 
26460 
26520 
26580 
26640 
26700 
26760 
26820 
26880 
26940 
27000 
27060 
27120 
27180 
27240 
27300 
27360 
27420 
27480 
27540 
27600 
27660 
27720 
27780 
27840 
27900 
27960 
28020 
28080 
28140 
28200 
tcagttccgt cgaggactga acagcgacgt ttacgcgcca ccggtatggt caggctgttc 28260
attccgatgg agcgtattgc aaggagcctg ttcaacagct cacttacttc gcaaacgagt 28320
actcaccgcc ctgctccagc gcctggcgat acgcaggtct ttcctggcat cgttgtaccc 28380
aggctgcaag gttaggatgc ggctgcagca ttccctgcat tttggcgaat tcgccaatga 28440
agctcatctg aatatccgcg ccactcaatt cgtcgcccag cagataaggc gtcagcccca 28500
gagcttcatt cagatagccc agatagttgg ccagttcaga gtgaatgcgc ggatgcaaag 28560
gcgcgcccgc gtcacccagg cgaccgacgt acaggttgag catcagcggc agaatggccg 28620
aaccttcggc gaagtgcagc cattgtacgt actcatcgta ggtggcgctg gcaggatccg 28680
gttgcaggcg gccgtcgcca tgacggcgga tcaggtaatc gacgatggcg ccagactcga 28740
taaccacatg gggaccgtct tcgatcaccg gggatttgcc cagcggatga atggccttca 28800
gctcaggcgg cgcgaggttg gttttcgggt cgcgctggta gcgttttatc tcgtacggca 28860
ggccaagttc ttcgagtaac cacagaatgc gctgcgaacg tgagttgttc aggtggtgga 28920
caataatcat gtgggtctcc gctgggtgag agtgggatgt ctagaaaaag actgctgggc 28980
cgccgtagag tgccgtgaat cgaatgtcct ctggcgacct cagacgcgtc tgtcggcgca 29040
gagcgctgcc gactcaccgc gaagctgacg ctccactgcc gctttatcga ttaccgacca 29100
aacgccgatt atcttgccat cgctgaatgt gtagaacaca ttttcggaaa aggtgatgcg 29160
ccgtccctgt gtgtcctgcc ccagaaatcg accctgtggc gagcagttga agaccagccg 29220
ggcagcgacc tgtggtgctt caacgaccag caaatcgatc ttgaaacgca agtcggggat 29280
aatcctgacg tcgttttcca gcattgtttt gtagccggaa aggctgatca gctcaccgtt 29340
gtaatgcaca ttgtcatcga cgaagttgcc caactggtgc caactacggt cattcagaca 29400
ggcgatgtaa gcccgatagt gatcggtcag gttcatggcg cgccctcctt caggtgctca 29460
aagcagtcac tgtcaatcat ccagataacc cgcacagttt taacagagtc atagggaact 29520
cgtgcggccg acatcgccct aagcctcaca tctatgtact ggcgcgacgc tggtttcaag 29580
cgaaggactt cagattcatg tcttcaagta gcactacagc agcggctgac acgcaaggtc 29640
ggcaaaacgc ctcgcctaac cgactgattt tcatctccgt acttgtggca accatgggcg 29700
cgctcgcgtt tggttatgac accggtatta tcgncggcgc attgcccttc atgacgctgc 29760
cggccgatca gggcgggctg ggtttgaatg cctacagcga agggatgatc acggcttcgc 29820
tgatcgtcgg tgcagccttc ggctcactgg ccagtggcta tatttccgac cgtttcggac 29880
gacgcctgac cctgcgcctc ctgtcggtgc tgttcatcgc gggtgcgctg ggtacggcca 29940
ttgcgccgtc cattccgttc atggtcgccg cgcgcttcct gctgggtatc gcggtgggtg 30000
gcggctcggc gacggtgccg gtgttcattg ccgaaatcgc cggcccctcg cgtcgtgcgc 30060
ggctggtcag ccgcaacgaa ctgatgatcg tcagcggcca gttgctcgcc tatgtgctca 30120
gcgcggtcat ggccgcgctg ctgcacacgc cgggcatctg gcgctatatg ctggcgatcg 30180
cgatggtgcc gggggtgttg ctgctgatcg gcaccttctt cgtacctcct tcgccgngct 30240
ggctggcgtc caaaggccgt tttgacgaag ctcaggatgt gctggagcaa ctgcgcagca 30300
acaaggacga tgcgcancgt gaagtggacg aaatgaaagc tcatgacgag caggcgcgca 30360
atcgt 30365



Several undefined nucleotides exist in SEQ. ID. No. 1; however, these appear to be
present in intergenic regions. The CEL of Pseudomonas syringae pv. tomato DC3000
contains a number of open reading frames (ORFs). Two of the products encoded by
the CEL are HrpW and AvrE, both of which are known. An additional 10 products
are produced by ORF1-10, respectively, as shown in Figure 3. The nucleotide
sequences for a number of these ORFs and their encoded protein or polypeptide
products are provided below.
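
By way of illustration only, the correspondence between an ORF nucleotide sequence and its encoded polypeptide can be checked computationally. The following is a minimal sketch, assuming the standard genetic code; the function name `translate` and the table names are hypothetical and are not part of the original disclosure. It translates the first 60 nucleotides of SEQ. ID. No. 2 and reports any codon containing an undefined nucleotide (n) as X.

```python
# Standard genetic code (NCBI translation table 1), with bases ordered t, c, a, g.
# These names are illustrative assumptions, not taken from the specification.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    first + second + third: AMINO[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) into one-letter amino acids.

    Codons containing an undefined nucleotide (e.g. 'n') are reported as 'X';
    a trailing partial codon is ignored.
    """
    seq = "".join(dna.lower().split())
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        codon = seq[pos:pos + 3]
        protein.append(CODON_TABLE.get(codon, "X"))
    return "".join(protein)


# First 60 nucleotides of SEQ. ID. No. 2 (ORF3), as listed in the specification.
first_60 = "atgatcagtt cgcggatcgg cggggccggt ggcgtcaaac tcagccgggt aaaccagcag"

if __name__ == "__main__":
    print(translate(first_60))  # prints MISSRIGGAGGVKLSRVNQQ
```

The printed one-letter string corresponds to residues 1 to 20 of SEQ. ID. No. 3 (Met Ile Ser Ser Arg Ile Gly Gly Ala Gly Gly Val Lys Leu Ser Arg Val Asn Gln Gln).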

The DNA molecule of ORF3 from the Pseudomonas syringae pv. tomato DC3000 CEL
has a nucleotide sequence (SEQ. ID. No. 2) as follows:



atgatcagtt cgcggatcgg cggggccggt ggcgtcaaac tcagccgggt aaaccagcag 60
cacgatactg ttcccgccca gacagctcac ccaaatgcag tcactgcagg catgaatccg 120
ccgctgactc ccgatcagtc agggtcacac gcgacagaaa gctcgtctgc cggcgcggcg 180
cggctgaatg tcgcggctcg acacacacag cttttgcagg ccttcaaggc tgagcatggg 240
acggctccgg tcagcggcgc gccgatgatc agttcgcgtg ctgcgttgtt gatcggtagt 300
ctgctgcagg ccgagccttt gccttttgaa gtcatggccg agaaattgtc tcctgagcgc 360
tatcaactga agcagtttca gggctcggac ttgcagcagc ggctggaaaa attcgcccag 420
ccgggtcaga taccggataa agccgaggtc gggcaactga tcaagggttt tgctcagtcg 480
gtcgctgatc aactggagca ctttcaactg atgcatgacg cttcgcccgc aacggtaggc 540
cagcatgcaa aagcggacaa ggcgacgctt gccgtcagtc agactgccct tggcgaatac 600
gccggtcgtg caagcaaggc aatcggcgaa ggcctgagca acagcatcgc gtcgctggat 660
gagcacatca gtgcgctgga tctcactctg caagatgccg aacagggcaa caaggagtct 720
ctgcacgctg acaggcaggc gctggtcgac gccaaaacca ccctggtagg tttgcacgcc 780
gatttcgtca agtcgccgga ggccaagcgc cttgcttcgg tcgccgcaca tacgcaactg 840
gacaacgtcg tcagcgatct cgtcactgcc cgtaacacgg tgggtggctg gaaaggtgca 900
gggccgattg tcgcggctgc ggttccgcag ttcttgtctt caatgacaca cttgggttat 960
gtgcgtttgt ccaccagcga caagctgcga gacacgattc ccgagaccag cagcgacgcc 1020
aacatgctca aggcttcgat aatcgggatg gtggcgggca ttgctcacga gacggtcaac 1080
agcgtggtca agccgatgtt tcaggccgcc ttgcagaaga ctggcctcaa cgaacgcctg 1140
aacatggtgc caatgaaggc tgtggatacc aatacggtta ttcctgaccc cttcgagctg 1200
aaaagcgaac acggtgagct ggtcaaaaaa acgcccgagg aagtcgctca ggacaaggcg 1260
ttcgtgaaaa gtgaacgcgc gctgctgaac cagaagaagg ttcagggttc gtccacccat 1320
ccggtaggtg agctgatggc ttacagtgcc ttcggtggtt ctcaggctgt gcgccagatg 1380
ctcaacgatg ttcaccagat caatgggcag acgctgagtg caagagctct ggcatccggt 1440
tttggcgggg cggtgtctgc cagttcgcaa acgctgctgc aattgaagtc gaattatgtc 1500
gacccgcaag ggcgcaaaat tccggtattt accccggacc gcgccgagag cgatctgaaa 1560
aaggacctgc tcaaaggtat ggacctgcgc gagccgtcgg tacgcaccac gttctacagc 1620
aaggctcttt cgggtattca gagttctgca ctgacctcgg cactgccgcc tgtgaccgct 1680
caggctgaag gcgcaagtgg cacgctcagt gcgggggcta ttttgcgcaa catggccctg 1740
gcagcgacgg gttcggtgtc ctatctgtcc acgttgtaca ccaaccagtc ggttaccgca 1800
gaagccaagg cgttgaaagc ggcaggcatg ggcggtgcaa cacctatgct ggaccgtacc 1860
gagacgcttt ga 1872
The protein or polypeptide encoded by Pto DC3000 CEL ORF3 has an amino acid
sequence (SEQ. ID. No. 3) as follows:



Met lie Ser Ser Arg lie Gly Gly Ala Gly Gly Val Lys Leu Ser Arg 
15 10 15 

Val Asn Gin Gin His Asp Thr Val Pro Ala Gin Thr Ala His Pro Asn 

20 25 30 



40 



Ala Val Thr Ala Gly Met Asn Pro Pro Leu Thr Pro Asp Gin Ser Gly 

35 40 45 

Ser His Ala Thr Glu Ser Ser Ser Ala Gly Ala Ala Arg Leu Asn Val 
50 55 60 



45 



Ala Ala Arg His Thr Gin Leu Leu Gin Ala Phe Lys Ala Glu His Gly 
65 70 75 80 

Thr Ala Pro Val Ser Gly Ala Pro Met lie Ser Ser Arg Ala Ala Leu 

85 90 95 



50 



Leu lie Gly Ser Leu Leu Gin Ala Glu Pro Leu Pro Phe Glu Val Met 

100 105 110 



Ala Glu Lys Leu Ser Pro Glu Arg Tyr Gln Leu Lys Gln Phe Gln Gly
115 120 125

Ser Asp Leu Gln Gln Arg Leu Glu Lys Phe Ala Gln Pro Gly Gln Ile
130 135 140

Pro Asp Lys Ala Glu Val Gly Gln Leu Ile Lys Gly Phe Ala Gln Ser
145 150 155 160

Val Ala Asp Gln Leu Glu His Phe Gln Leu Met His Asp Ala Ser Pro
165 170 175

Ala Thr Val Gly Gln His Ala Lys Ala Asp Lys Ala Thr Leu Ala Val
180 185 190

Ser Gln Thr Ala Leu Gly Glu Tyr Ala Gly Arg Ala Ser Lys Ala Ile
195 200 205

Gly Glu Gly Leu Ser Asn Ser Ile Ala Ser Leu Asp Glu His Ile Ser
210 215 220

Ala Leu Asp Leu Thr Leu Gln Asp Ala Glu Gln Gly Asn Lys Glu Ser
225 230 235 240

Leu His Ala Asp Arg Gln Ala Leu Val Asp Ala Lys Thr Thr Leu Val
245 250 255

Gly Leu His Ala Asp Phe Val Lys Ser Pro Glu Ala Lys Arg Leu Ala
260 265 270

Ser Val Ala Ala His Thr Gln Leu Asp Asn Val Val Ser Asp Leu Val
275 280 285

Thr Ala Arg Asn Thr Val Gly Gly Trp Lys Gly Ala Gly Pro Ile Val
290 295 300

Ala Ala Ala Val Pro Gln Phe Leu Ser Ser Met Thr His Leu Gly Tyr
305 310 315 320

Val Arg Leu Ser Thr Ser Asp Lys Leu Arg Asp Thr Ile Pro Glu Thr
325 330 335

Ser Ser Asp Ala Asn Met Leu Lys Ala Ser Ile Ile Gly Met Val Ala
340 345 350

Gly Ile Ala His Glu Thr Val Asn Ser Val Val Lys Pro Met Phe Gln
355 360 365

Ala Ala Leu Gln Lys Thr Gly Leu Asn Glu Arg Leu Asn Met Val Pro
370 375 380

Met Lys Ala Val Asp Thr Asn Thr Val Ile Pro Asp Pro Phe Glu Leu
385 390 395 400

Lys Ser Glu His Gly Glu Leu Val Lys Lys Thr Pro Glu Glu Val Ala
405 410 415

Gln Asp Lys Ala Phe Val Lys Ser Glu Arg Ala Leu Leu Asn Gln Lys
420 425 430

Lys Val Gln Gly Ser Ser Thr His Pro Val Gly Glu Leu Met Ala Tyr
435 440 445

Ser Ala Phe Gly Gly Ser Gln Ala Val Arg Gln Met Leu Asn Asp Val
450 455 460

His Gln Ile Asn Gly Gln Thr Leu Ser Ala Arg Ala Leu Ala Ser Gly
465 470 475 480

Phe Gly Gly Ala Val Ser Ala Ser Ser Gln Thr Leu Leu Gln Leu Lys
485 490 495

Ser Asn Tyr Val Asp Pro Gln Gly Arg Lys Ile Pro Val Phe Thr Pro
500 505 510



Asp Arg Ala Glu Ser Asp Leu Lys Lys Asp Leu Leu Lys Gly Met Asp
515 520 525

Leu Arg Glu Pro Ser Val Arg Thr Thr Phe Tyr Ser Lys Ala Leu Ser
530 535 540

Gly Ile Gln Ser Ser Ala Leu Thr Ser Ala Leu Pro Pro Val Thr Ala
545 550 555 560

Gln Ala Glu Gly Ala Ser Gly Thr Leu Ser Ala Gly Ala Ile Leu Arg
565 570 575

Asn Met Ala Leu Ala Ala Thr Gly Ser Val Ser Tyr Leu Ser Thr Leu
580 585 590

Tyr Thr Asn Gln Ser Val Thr Ala Glu Ala Lys Ala Leu Lys Ala Ala
595 600 605

Gly Met Gly Gly Ala Thr Pro Met Leu Asp Arg Thr Glu Thr Leu
610 615 620



The DNA molecule of ORF4 from the Pseudomonas syringae pv.
tomato DC3000 CEL has a nucleotide sequence (SEQ. ID. No. 4) as follows: 



atgaccaaca atgaccagta ccacaccctt atcaacgaaa tctgcgcact cagcctgatt 60 
tccacacctg aacgtttcta tgaatctgcc aatttcaaaa tcagcgaagt ggacttcacc 120 
ctgcagtttc aggaccgcga cgaaggccgt gccgttctga tctacggtga catgggcgcg 180 
ttgcccgcgc gcggccgtga gagcgcgttg ctggcgttga tggacatcaa ctttcacatg 240 
ttcgcgggcg cccacagccc ggcattttcc tttaatgcgc agaccggtcg tgtgctgctg 300 
atgggctctg tggcccttga acgagcctct gccgaaggcg tgctgttgtt gatgaagtcg 360 
ttttccgacc tggccaaaga gtggcgcgag catggattca tggggcaggc cacaactgca 420 
ggctcctcga cggaccaacc tgttgcccca gcagccaaac gcgagagcct ttcggctcct 480 
gggagattcc aatga 495 



The protein or polypeptide encoded by Pto DC3000 CEL ORF4 has an amino acid 
sequence (SEQ. ID. No. 5) as follows: 



Met Thr Asn Asn Asp Gln Tyr His Thr Leu Ile Asn Glu Ile Cys Ala
1 5 10 15

Leu Ser Leu Ile Ser Thr Pro Glu Arg Phe Tyr Glu Ser Ala Asn Phe
20 25 30

Lys Ile Ser Glu Val Asp Phe Thr Leu Gln Phe Gln Asp Arg Asp Glu
35 40 45

Gly Arg Ala Val Leu Ile Tyr Gly Asp Met Gly Ala Leu Pro Ala Arg
50 55 60

Gly Arg Glu Ser Ala Leu Leu Ala Leu Met Asp Ile Asn Phe His Met
65 70 75 80

Phe Ala Gly Ala His Ser Pro Ala Phe Ser Phe Asn Ala Gln Thr Gly
85 90 95

Arg Val Leu Leu Met Gly Ser Val Ala Leu Glu Arg Ala Ser Ala Glu
100 105 110

Gly Val Leu Leu Leu Met Lys Ser Phe Ser Asp Leu Ala Lys Glu Trp
115 120 125

Arg Glu His Gly Phe Met Gly Gln Ala Thr Thr Ala Gly Ser Ser Thr
130 135 140

Asp Gln Pro Val Ala Pro Ala Ala Lys Arg Glu Ser Leu Ser Ala Pro
145 150 155 160

Gly Arg Phe Gln



The DNA molecule of ORF5 from the Pseudomonas syringae pv.
tomato DC3000 CEL has a nucleotide sequence (SEQ. ID. No. 6) as follows: 



atgcacatca accgacgcgt ccaacaaccg cctgtgactg cgacggatag ctttcggaca 60
gcgtccgacg cgtctcttgc ctccagctct gtgcgatctg tcagctccga tcagcaacgc 120
gagataaatg cgattgccga ttacctgaca gatcatgtgt tcgctgcgca taaactgccg 180
ccggccgatt cggctgatgg ccaagctgca gttgacgtac acaatgcgca gatcactgcg 240
ctgatcgaga cgcgcgccag ccgcctgcac ttcgaagggg aaaccccggc aaccatcgcc 300
gacaccttcg ccaaggcgga aaagctcgac cgattggcga cgactacatc aggcgcgttg 360
cgggcgacgc cctttgccat ggcctcgttg cttcagtaca tgcagcctgc gatcaacaag 420
ggcgattggc tgccggctcc gctcaaaccg ctgaccccgc tcatttccgg agcgctgtcg 480
ggcgccatgg accaggtggg caccaagatg atggaccgcg cgacgggtga tctgcattac 540
ctgagcgcct cgccggacag gctccacgat gcgatggccg cttcggtgaa gcgccactcg 600
ccaagccttg ctcgacaggt tctggacacg ggggttgcgg ttcagacgta ctcggcgcgc 660
aacgccgtac gtaccgtatt ggctccggca ctggcgtcca gacccgccgt gcagggtgct 720
gtggaccttg gtgtatcgat ggcgggtggt ctggctgcca acgcaggctt tggcaaccgc 780
ctgctcagtg tgcagtcgcg tgatcaccag cgtggcggtg cattagtgct cggtttgaag 840
gataaagagc ccaaggctca actgagcgaa gaaaacgact ggctcgaggc ttataaagca 900
atcaaatcgg ccagctactc gggtgcggcg ctcaacgctg gcaagcggat ggccggtctg 960
ccactggata tggcgaccga cgcaatgggt gcggtaagaa gcctggtgtc agcgtccagc 1020
ctgacccaaa acggtctggc cctggcgggt ggctttgcag gggtaggcaa gttgcaggag 1080
atggcgacga aaaatatcac cgacccggcg accaaggccg cggtcagtca gttgaccaac 1140
ctggcaggtt cggcagccgt tttcgcaggc tggaccacgg ccgcgctgac aaccgatccc 1200
gcggtgaaaa aagccgagtc gttcatacag gacacggtga aatcgactgc atccagtacc 1260
acaggctacg tagccgacca gaccgtcaaa ctggcgaaga ccgtcaaaga catgggcggg 1320
gaggcgatca cccataccgg cgccagcttg cgcaatacgg tcaataacct gcgtcaacgc 1380
ccggctcgtg aagctgatat agaagagggg ggcacggcgg cttctccaag tgaaataccg 1440
tttcggccta tgcggtcgta a 1461



The protein or polypeptide encoded by Pto DC3000 CEL ORF5, now known as 
HopPtoA, has an amino acid sequence (SEQ. ID. No. 7) as follows: 



Met His Ile Asn Arg Arg Val Gln Gln Pro Pro Val Thr Ala Thr Asp
1 5 10 15

Ser Phe Arg Thr Ala Ser Asp Ala Ser Leu Ala Ser Ser Ser Val Arg
20 25 30

Ser Val Ser Ser Asp Gln Gln Arg Glu Ile Asn Ala Ile Ala Asp Tyr
35 40 45

Leu Thr Asp His Val Phe Ala Ala His Lys Leu Pro Pro Ala Asp Ser
50 55 60

Ala Asp Gly Gln Ala Ala Val Asp Val His Asn Ala Gln Ile Thr Ala
65 70 75 80

Leu Ile Glu Thr Arg Ala Ser Arg Leu His Phe Glu Gly Glu Thr Pro
85 90 95

Ala Thr Ile Ala Asp Thr Phe Ala Lys Ala Glu Lys Leu Asp Arg Leu
100 105 110

Ala Thr Thr Thr Ser Gly Ala Leu Arg Ala Thr Pro Phe Ala Met Ala
115 120 125

Ser Leu Leu Gln Tyr Met Gln Pro Ala Ile Asn Lys Gly Asp Trp Leu
130 135 140

Pro Ala Pro Leu Lys Pro Leu Thr Pro Leu Ile Ser Gly Ala Leu Ser
145 150 155 160

Gly Ala Met Asp Gln Val Gly Thr Lys Met Met Asp Arg Ala Thr Gly
165 170 175

Asp Leu His Tyr Leu Ser Ala Ser Pro Asp Arg Leu His Asp Ala Met
180 185 190

Ala Ala Ser Val Lys Arg His Ser Pro Ser Leu Ala Arg Gln Val Leu
195 200 205

Asp Thr Gly Val Ala Val Gln Thr Tyr Ser Ala Arg Asn Ala Val Arg
210 215 220

Thr Val Leu Ala Pro Ala Leu Ala Ser Arg Pro Ala Val Gln Gly Ala
225 230 235 240

Val Asp Leu Gly Val Ser Met Ala Gly Gly Leu Ala Ala Asn Ala Gly
245 250 255

Phe Gly Asn Arg Leu Leu Ser Val Gln Ser Arg Asp His Gln Arg Gly
260 265 270

Gly Ala Leu Val Leu Gly Leu Lys Asp Lys Glu Pro Lys Ala Gln Leu
275 280 285

Ser Glu Glu Asn Asp Trp Leu Glu Ala Tyr Lys Ala Ile Lys Ser Ala
290 295 300

Ser Tyr Ser Gly Ala Ala Leu Asn Ala Gly Lys Arg Met Ala Gly Leu
305 310 315 320

Pro Leu Asp Met Ala Thr Asp Ala Met Gly Ala Val Arg Ser Leu Val
325 330 335

Ser Ala Ser Ser Leu Thr Gln Asn Gly Leu Ala Leu Ala Gly Gly Phe
340 345 350

Ala Gly Val Gly Lys Leu Gln Glu Met Ala Thr Lys Asn Ile Thr Asp
355 360 365

Pro Ala Thr Lys Ala Ala Val Ser Gln Leu Thr Asn Leu Ala Gly Ser
370 375 380

Ala Ala Val Phe Ala Gly Trp Thr Thr Ala Ala Leu Thr Thr Asp Pro
385 390 395 400

Ala Val Lys Lys Ala Glu Ser Phe Ile Gln Asp Thr Val Lys Ser Thr
405 410 415

Ala Ser Ser Thr Thr Gly Tyr Val Ala Asp Gln Thr Val Lys Leu Ala
420 425 430

Lys Thr Val Lys Asp Met Gly Gly Glu Ala Ile Thr His Thr Gly Ala
435 440 445

Ser Leu Arg Asn Thr Val Asn Asn Leu Arg Gln Arg Pro Ala Arg Glu
450 455 460

Ala Asp Ile Glu Glu Gly Gly Thr Ala Ala Ser Pro Ser Glu Ile Pro
465 470 475 480

Phe Arg Pro Met Arg Ser
485



The DNA molecule of ORF6 from the Pseudomonas syringae pv. 
tomato DC3000 CEL has a nucleotide sequence (SEQ. ID. No. 8) as follows: 



atgtctggtc ctttcgagaa aaaatggcgg tgtttcaccc gaaccgtgac ctacgttggc 60
tggtcgctgt tctggcttct gctctgggac gtggccgtca ccgtggacgt catgctgata 120
gaaggcaaag gcatcgactt ccccctgatg cccctcacgt tgctttgctc ggcactgatc 180
gtgctgatca gctttcgcaa ctcgagtgcc tataaccgtt ggtgggaagc gcgcaccttg 240
tggggcgcaa tggtcaacac ttcacgcagt tttggccggc aggtactgac gctgatcgat 300
ggcgaacggg atgacctcaa caaccctgtc aaagccatac tctttcaacg tcatgtggct 360
tacttgcgtg ccctgcgcgc gcacctcaaa ggcgacgtca aaacagcaaa actcgacggg 420
ttactgtcgc ccgacgagat tcagcgcgcc agccagagca acaacttccc caatgacatc 480
ctcaatggct ctgctgcggt tatctcgcaa gcctttgccg ccggccagtt cgacagcatc 540
cgtctgaccc gcctggaatc gaccatggtc gatctgtcca actgtcaggg cggcatggag 600
cgcatcgcca acacgccact gccctacccc tacgtttatt tcccacggct gttcagcacg 660
ctgttctgca tcctgatgcc gctgagcatg gtcaccaccc tgggctggtt caccccggcg 720
atctccacgg tggtaggctg catgctgctg gcaatggacc gcatcggtac agacctgcaa 780
gccccgttcg gcaacagtca gcaccggatc cgcatggaag acctgtgcaa caccatcgaa 840
aagaacctgc aatcgatgtt ctcttcgcca gagaggcagc cgctgctggc tgacctgaaa 900
agccccgtac cgtggcgcgt ggccaacgca tcaattggcg gtctgagcag gcagaaaaac 960
aggttagggg aaggcgcgag gcttatcgca agtgaaagtc tgctctgggc accatttcgc 1020
tcagttgcag acgttgctcc gtgccacgcc agtgcgtacc tacgtcgcgc ttga 1074



The protein or polypeptide encoded by Pto DC3000 CEL ORF6 has an amino acid
sequence (SEQ. ID. No. 9) as follows:

Met Ser Gly Pro Phe Glu Lys Lys Trp Arg Cys Phe Thr Arg Thr Val
1 5 10 15

Thr Tyr Val Gly Trp Ser Leu Phe Trp Leu Leu Leu Trp Asp Val Ala
20 25 30

Val Thr Val Asp Val Met Leu Ile Glu Gly Lys Gly Ile Asp Phe Pro
35 40 45

Leu Met Pro Leu Thr Leu Leu Cys Ser Ala Leu Ile Val Leu Ile Ser
50 55 60

Phe Arg Asn Ser Ser Ala Tyr Asn Arg Trp Trp Glu Ala Arg Thr Leu
65 70 75 80

Trp Gly Ala Met Val Asn Thr Ser Arg Ser Phe Gly Arg Gln Val Leu
85 90 95

Thr Leu Ile Asp Gly Glu Arg Asp Asp Leu Asn Asn Pro Val Lys Ala
100 105 110

Ile Leu Phe Gln Arg His Val Ala Tyr Leu Arg Ala Leu Arg Ala His
115 120 125

Leu Lys Gly Asp Val Lys Thr Ala Lys Leu Asp Gly Leu Leu Ser Pro
130 135 140

Asp Glu Ile Gln Arg Ala Ser Gln Ser Asn Asn Phe Pro Asn Asp Ile
145 150 155 160

Leu Asn Gly Ser Ala Ala Val Ile Ser Gln Ala Phe Ala Ala Gly Gln
165 170 175

Phe Asp Ser Ile Arg Leu Thr Arg Leu Glu Ser Thr Met Val Asp Leu
180 185 190

Ser Asn Cys Gln Gly Gly Met Glu Arg Ile Ala Asn Thr Pro Leu Pro
195 200 205

Tyr Pro Tyr Val Tyr Phe Pro Arg Leu Phe Ser Thr Leu Phe Cys Ile
210 215 220

Leu Met Pro Leu Ser Met Val Thr Thr Leu Gly Trp Phe Thr Pro Ala
225 230 235 240

Ile Ser Thr Val Val Gly Cys Met Leu Leu Ala Met Asp Arg Ile Gly
245 250 255

Thr Asp Leu Gln Ala Pro Phe Gly Asn Ser Gln His Arg Ile Arg Met
260 265 270

Glu Asp Leu Cys Asn Thr Ile Glu Lys Asn Leu Gln Ser Met Phe Ser
275 280 285

Ser Pro Glu Arg Gln Pro Leu Leu Ala Asp Leu Lys Ser Pro Val Pro
290 295 300

Trp Arg Val Ala Asn Ala Ser Ile Gly Gly Leu Ser Arg Gln Lys Asn
305 310 315 320

Arg Leu Gly Glu Gly Ala Arg Leu Ile Ala Ser Glu Ser Leu Leu Trp
325 330 335

Ala Pro Phe Arg Ser Val Ala Asp Val Ala Pro Cys His Ala Ser Ala
340 345 350

Tyr Leu Arg Arg Ala
355



The DNA molecule of ORF7 from the Pseudomonas syringae pv.
tomato DC3000 CEL has a nucleotide sequence (SEQ. ID. No. 10) as follows: 



atgtatatcc agcaatctgg cgcccaatca ggggttgccg ctaagacgca acacgataag 60
ccctcgtcat tgtccggact cgcccccggt tcgtcggatg cgttcgcccg ttttcatccc 120
gaaaaggcgg gcgcctttgt cccattggag gggcatgaag aggtcttttt cgatgcgcgc 180
tcttcctttt cgtcggtcga tgccgctgat cttcccagtc ccgagcaggt acaaccccag 240
cttcattcgt tgcgtaccct gctaccggat ctgatggtct ctatcgcctc attacgtgac 300
ggcgccacgc aatacatcaa gaccagaatc aaggctatgg cggacaacag cataggcgcg 360
actgcgaaca tcgaagccaa aagaaagatt gcccaagagc acggctgtca gcttgtccac 420
ccgtttcacc agagcaaatt tctatttgaa aaaactatcg atgatagagc gtttgctgct 480
gactatggcc gcgcgggtgg cgacgggcac gcttgtctgg ggctatcagt aaattggtgt 540
cagagccgtg caaaagggca gtcggatgag gccttctttc acaaactgga ggactatcag 600
ggcgatgcat tgctacccag ggtaatgggc ttccagcata tcgagcagca ggcctattca 660
aacaagttgc agaacgcagc acctatgctt ctggacacac ttcccaagtt gggcatgaca 720
cttggaaaag ggctgggcag agcacagcac gcgcactatg cggttgctct ggaaaacctt 780
gatcgcgatc tcaaagcagt gttgcagccc ggtaaagacc agatgcttct gtttttgagt 840
gatagccatg cgatggctct gcatcaggac agtcagggat gtctgcattt ttttgatcct 900
ctttttggcg tggttcaggc agacagcttc agcaacatga gccattttct tgctgatgtg 960
ttcaagcgcg acgtaggtac gcactggcgt ggcacggagc aacgtctgca actgagcgaa 1020
atggtgccca gagcagactt tcacttgcga taa 1053



The protein or polypeptide encoded by Pto DC3000 CEL ORF7 has an amino acid
sequence (SEQ. ID. No. 11) as follows:

Met Tyr Ile Gln Gln Ser Gly Ala Gln Ser Gly Val Ala Ala Lys Thr
1 5 10 15

Gln His Asp Lys Pro Ser Ser Leu Ser Gly Leu Ala Pro Gly Ser Ser
20 25 30

Asp Ala Phe Ala Arg Phe His Pro Glu Lys Ala Gly Ala Phe Val Pro
35 40 45

Leu Glu Gly His Glu Glu Val Phe Phe Asp Ala Arg Ser Ser Phe Ser
50 55 60

Ser Val Asp Ala Ala Asp Leu Pro Ser Pro Glu Gln Val Gln Pro Gln
65 70 75 80

Leu His Ser Leu Arg Thr Leu Leu Pro Asp Leu Met Val Ser Ile Ala
85 90 95

Ser Leu Arg Asp Gly Ala Thr Gln Tyr Ile Lys Thr Arg Ile Lys Ala
100 105 110

Met Ala Asp Asn Ser Ile Gly Ala Thr Ala Asn Ile Glu Ala Lys Arg
115 120 125

Lys Ile Ala Gln Glu His Gly Cys Gln Leu Val His Pro Phe His Gln
130 135 140

Ser Lys Phe Leu Phe Glu Lys Thr Ile Asp Asp Arg Ala Phe Ala Ala
145 150 155 160

Asp Tyr Gly Arg Ala Gly Gly Asp Gly His Ala Cys Leu Gly Leu Ser
165 170 175

Val Asn Trp Cys Gln Ser Arg Ala Lys Gly Gln Ser Asp Glu Ala Phe
180 185 190

Phe His Lys Leu Glu Asp Tyr Gln Gly Asp Ala Leu Leu Pro Arg Val
195 200 205

Met Gly Phe Gln His Ile Glu Gln Gln Ala Tyr Ser Asn Lys Leu Gln
210 215 220

Asn Ala Ala Pro Met Leu Leu Asp Thr Leu Pro Lys Leu Gly Met Thr
225 230 235 240

Leu Gly Lys Gly Leu Gly Arg Ala Gln His Ala His Tyr Ala Val Ala
245 250 255

Leu Glu Asn Leu Asp Arg Asp Leu Lys Ala Val Leu Gln Pro Gly Lys
260 265 270

Asp Gln Met Leu Leu Phe Leu Ser Asp Ser His Ala Met Ala Leu His
275 280 285

Gln Asp Ser Gln Gly Cys Leu His Phe Phe Asp Pro Leu Phe Gly Val
290 295 300

Val Gln Ala Asp Ser Phe Ser Asn Met Ser His Phe Leu Ala Asp Val
305 310 315 320

Phe Lys Arg Asp Val Gly Thr His Trp Arg Gly Thr Glu Gln Arg Leu
325 330 335

Gln Leu Ser Glu Met Val Pro Arg Ala Asp Phe His Leu Arg
340 345 350



The DNA molecule of ORF8 from the Pseudomonas syringae pv.
tomato DC3000 CEL has a nucleotide sequence (SEQ. ID. No. 12) as follows:



atgcggcctg tcgaggcaaa agatcggctt tatcagtggc tgcgcaatcg aggcatcgat 60
gcgcaggagg gtcaacgcca caacgtaagg accgcgaatg gaagcgagtg tctgctctgg 120
ttgccagaac aggacacttc gttgttcatc ttcacacaga tcgaaaggct gacgatgccg 180
caggacaacg tcattttgat tctggcaatg gcgctgaatc tggagcctgc tcgcacaggt 240
ggcgctgcgc ttggctataa ccctgattca agggaactgt tgttgcgcag tgtgcactca 300
atggcggatc tggatgagac cggacttgat cacctcatga cgcgaattag cacattggcc 360
gtctcgttgc agcgctatct ggaagattat cgacgccagg agcaagccgg aaaaaccgcc 420
cagaaagagc ctcggttctt accggctgtc catctgaccc cacgaacgtt catgacctga 480



The protein or polypeptide encoded by Pto DC3000 CEL ORF8 has an amino acid
sequence (SEQ. ID. No. 13) as follows:

Met Arg Pro Val Glu Ala Lys Asp Arg Leu Tyr Gln Trp Leu Arg Asn
1 5 10 15

Arg Gly Ile Asp Ala Gln Glu Gly Gln Arg His Asn Val Arg Thr Ala
20 25 30

Asn Gly Ser Glu Cys Leu Leu Trp Leu Pro Glu Gln Asp Thr Ser Leu
35 40 45

Phe Ile Phe Thr Gln Ile Glu Arg Leu Thr Met Pro Gln Asp Asn Val
50 55 60

Ile Leu Ile Leu Ala Met Ala Leu Asn Leu Glu Pro Ala Arg Thr Gly
65 70 75 80

Gly Ala Ala Leu Gly Tyr Asn Pro Asp Ser Arg Glu Leu Leu Leu Arg
85 90 95

Ser Val His Ser Met Ala Asp Leu Asp Glu Thr Gly Leu Asp His Leu
100 105 110

Met Thr Arg Ile Ser Thr Leu Ala Val Ser Leu Gln Arg Tyr Leu Glu
115 120 125

Asp Tyr Arg Arg Gln Glu Gln Ala Gly Lys Thr Ala Gln Lys Glu Pro
130 135 140

Arg Phe Leu Pro Ala Val His Leu Thr Pro Arg Thr Phe Met Thr
145 150 155



The DNA molecule of ORF9 from the Pseudomonas syringae pv.
tomato DC3000 CEL has a nucleotide sequence (SEQ. ID. No. 14) as follows: 



atgcttaaaa aatgcctgct actggttata tcaatgtcac ttggcggctg ctggagcctg 60
atgattcatc tggacggcga gcgttgcatc tatcccggca ctcgccaagg ttgggcgtgg 120
ggaacccata acggagggca gagttggccc atacttatag acgtgccgtt ttccctcgcg 180
ttggacacac tgctgctgcc ctacgacctc accgcttttc tgcccgaaaa tcttggcggt 240
gatgaccgca aatgtcagtt cagtggagga ttgaacgtgc tcggttga 288



The protein or polypeptide encoded by Pto DC3000 CEL ORF9 has an amino acid 
sequence (SEQ. ID. No. 15) as follows: 



Met Leu Lys Lys Cys Leu Leu Leu Val Ile Ser Met Ser Leu Gly Gly
1 5 10 15

Cys Trp Ser Leu Met Ile His Leu Asp Gly Glu Arg Cys Ile Tyr Pro
20 25 30

Gly Thr Arg Gln Gly Trp Ala Trp Gly Thr His Asn Gly Gly Gln Ser
35 40 45

Trp Pro Ile Leu Ile Asp Val Pro Phe Ser Leu Ala Leu Asp Thr Leu
50 55 60

Leu Leu Pro Tyr Asp Leu Thr Ala Phe Leu Pro Glu Asn Leu Gly Gly
65 70 75 80

Asp Asp Arg Lys Cys Gln Phe Ser Gly Gly Leu Asn Val Leu Gly
85 90 95
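
The correspondence between the ORF9 nucleotide sequence (SEQ. ID. No. 14) and its encoded polypeptide (SEQ. ID. No. 15) can be checked by conceptual translation. The following is a minimal illustrative sketch, not part of the original disclosure; it assumes the standard genetic code, and the names `translate`, `CODON_TABLE`, `ORF9_DNA`, and `orf9_protein` are hypothetical helpers introduced only for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (an assumption of this sketch; '*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1 until the first stop codon.

    Whitespace, digits, and any other non-ACGT characters are ignored,
    so a listing copied with its spacing can be passed in directly.
    """
    seq = "".join(ch for ch in dna.upper() if ch in "ACGT")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon: end of the open reading frame
            break
        protein.append(aa)
    return "".join(protein)


# ORF9 nucleotide sequence as listed above (SEQ. ID. No. 14), position counts omitted.
ORF9_DNA = """
atgcttaaaa aatgcctgct actggttata tcaatgtcac ttggcggctg ctggagcctg
atgattcatc tggacggcga gcgttgcatc tatcccggca ctcgccaagg ttgggcgtgg
ggaacccata acggagggca gagttggccc atacttatag acgtgccgtt ttccctcgcg
ttggacacac tgctgctgcc ctacgacctc accgcttttc tgcccgaaaa tcttggcggt
gatgaccgca aatgtcagtt cagtggagga ttgaacgtgc tcggttga
"""

orf9_protein = translate(ORF9_DNA)
print(len(orf9_protein), orf9_protein)
```

The one-letter output can be compared residue by residue with the three-letter listing above (M = Met, L = Leu, K = Lys, and so on); the same helper could be applied to the other ORF sequences in this section.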



The DNA molecule of ORF10 from the Pseudomonas syringae pv.
tomato DC3000 CEL has a nucleotide sequence (SEQ. ID. No. 16) as follows: 



atgaaacagg tagaagtcca gatcattact gaattgcctt gtcaggttct gatcctggag 60 
caagaggcag tagcagaggg cttcaggttt cttacccgct tgatcgagga gtggaggtcc 120 
ggaaagaatc gattcgaggc caagggtgaa tgcctcatgg tcgtacttct ggacggcgct 180 
ctggcaggta tcggaggcct ttcgcgtgat ccgcatgccc ggggtgatat gggcaggcta 240 
cgacggttat acgtcgcaag cgcatcaaga ggtcaaggcc ttggaaagac tctggtgaat 300 
cgacttgtgg agcatgcggc gcaggaattt ttcgccgtgc gcctgttcac tgatactccg 360 
agcggagcaa aattttactt acgttgcggc tttcaggcag ttgacgaggt gcatgccacg 420 
catataaagc ttttaaggcg ggtttga 447 



The protein or polypeptide encoded by Pto DC3000 CEL ORF10 has an amino acid 
sequence (SEQ. ID. No. 17) as follows: 



Met Lys Gln Val Glu Val Gln Ile Ile Thr Glu Leu Pro Cys Gln Val
1 5 10 15

Leu Ile Leu Glu Gln Glu Ala Val Ala Glu Gly Phe Arg Phe Leu Thr
20 25 30

Arg Leu Ile Glu Glu Trp Arg Ser Gly Lys Asn Arg Phe Glu Ala Lys
35 40 45

Gly Glu Cys Leu Met Val Val Leu Leu Asp Gly Ala Leu Ala Gly Ile
50 55 60

Gly Gly Leu Ser Arg Asp Pro His Ala Arg Gly Asp Met Gly Arg Leu
65 70 75 80

Arg Arg Leu Tyr Val Ala Ser Ala Ser Arg Gly Gln Gly Leu Gly Lys
85 90 95

Thr Leu Val Asn Arg Leu Val Glu His Ala Ala Gln Glu Phe Phe Ala
100 105 110

Val Arg Leu Phe Thr Asp Thr Pro Ser Gly Ala Lys Phe Tyr Leu Arg
115 120 125

Cys Gly Phe Gln Ala Val Asp Glu Val His Ala Thr His Ile Lys Leu
130 135 140

Leu Arg Arg Val
145



A DNA molecule which contains the EEL of Pseudomonas syringae 
pv. tomato DC3000 has a nucleotide sequence (SEQ. ID. No. 18) as follows: 



ggatccagcg gcgtattgtc gtggcgatgg aacgcgttac ggattttcag cacaccggta 60
tcgatgaaca ggtggccgtt gcgggcgttg cgggtcggca tgacacaatc gaacatatca 120
acgccacggc gcacaccttc gaccagatct tcgggcttgc ctacacccat caagtaacga 180
ggtttgtctg ctggcataag gcccggcagg taatccagca ccttgatcat ctcgtgcttg 240
ggctcgccca ccgacagacc gccaatcgcc aggccgtcaa agccgatctc atccaggcct 300
tcgagcgaac gcttgcgcag gttctcgtgc atgccaccct gaacaatgcc gaacagcgcg 360
gcagtgtttt cgccgtgcgc gaccttggag cgcttggccc agcgcaacga cagctccatg 420
gagacacgtg ctacgtcttc gtcggccggg tacggcgtgc actcatcgaa aatcatcacg 480
acgtccgaac ccaggtcacg ctggacctgc atcgactctt ccgggcccat gaacaccttg 540
gcaccatcga ccggagaggc gaaggtcacg ccctcctcct tgatcttgcg catggcgccc 600
aggctgaaca cctgaaaacc gccagagtcg gtcagaatcg gccctttcca ctgcatgaaa 660
tcgtgcaggt cgccgtggcc cttgatgacc tcggtgcccg gacgcagcca caagtggaag 720
gtgttgccca gaatcatctg cgcaccggtg gcctcgatat cacgcggcaa catgcccttg 780
accgtgccgt aggtgcccac cggcatgaac gccggggtct cgaccacgcc acgcggaaag 840
gtcaggcgac cgcgacgggc cttgccgtcg gtggccaaca actcgaaaga catacgacag 900
gtgcgactca tgcgtgatcc tctggtgccg attcctgtgg ggccgtcggc gcgggattgc 960
gggtgatgaa catggcatca ccgtaactga agaagcggta cccgtgttcg atggccgccg 1020
cgtaggccgc catggtttcg ggataaccgg cgaacgccga aaccagcatc aacagcgtgg 1080
attcaggcaa atgaaaatta gtcaccaggg catcgaccac atgaaacggc cgccccggat 1140
agatgaagat gtcggtgtcg ccgctaaacg gcttcaactg gccatcacgc gcggcactct 1200
ccagcgaacg cacgctggtg gtcccgaccg caatcacccg cccgccccgc gcacggcacg 1260
ccgccacggc atcgaccacg tcctggctga cttccagcca ttcgctgtgc atgtggtgat 1320
cttcgatctg ctcgacacgc accggctgga acgtacccgc gccgacgtgc agagtgacaa 1380
aagcagtctc gacgcccttg gcggcaattg cttccatcaa cggctggtcg aaatgcaggc 1440
cggcagtcgg cgccgccaca gcaccggcgc gctgggcgta aacggtctga taacgctcgc 1500
ggtcggcacc ttcgtccggg cggtctatat aaggaggcaa cggcatatgg ccgacacgat 1560
ccagcaacgg cagcacttct tcggcaaagc gcaactcgaa cagcgcgtca tgccgcgcca 1620
ccatctcggc ctcgccgccg ccatcgatca ggatcgacga gcccggcttt ggcgacttgc 1680
tggcacgcac gtgcgccagc acacgatggc tgtccagcac gcgctcgacc agaatctcca 1740
gcttgccgcc ggacgccttc tgcccgaaca aacgtgcggg aatgacacgg gtattgttga 1800
acaccatcaa gtcgcccgag cgcaaatgct cgagcaaatc ggtgaattga cgatgtgcca 1860
gcgcgcccgt cggcccatca agggtcaaca gacgactgct gcgacgctcg gccaacgggt 1920
gacgagcaat cagggaatcg gggagttcga aggtaaagtc agcgacgcgc atgatcgggt 1980
tcgtttagca gggccgggaa gtttatccgg tttgacggca ttagtaaaaa acctgcgtaa 2040
atccctgttg accaacggaa aactcatcct tatacttcgc cgccattgag ccctgatggc 2100
ggaattggta gacgcggcgg attcaaaatc cgttttcgaa agaagtggga gttcgattct 2160
ccctcggggc accaccattg agaaaagacc ttgaaattca aggtcttttt tttcgtctgg 2220
tggaaagtgg tctgactgag gctgcgatct accccacctg cccggaattg gccgcggagc 2280
gcccaggact gccttccagc gcagagcgtc ggtacccgga tcacacgacc aaggataacg 2340
ctatgaacaa gatcgtctac gtaaaagctt acttcaaacc cattggggag gaagtctcgg 2400
ttaaagtacc tacaggcgaa attaaaaagg gctttttcgg cgacaaggaa atcatgaaaa 2460
aagagaccca gtggcagcaa accgggtggt ctgattgtca gatagacggt gaacggctat 2520
cgaaagacgt cgaagacgca gtggcgcaac tcaatgctga cggttatgag attcaaacgg 2580
tattgcctat attgtccggg gcttatgatt atgcgctcaa ataccgatac gaaatacgtc 2640
acaatagaac tgaactaagc ccaggagacc agtcctatgt cttcggctat ggctacagct 2700
tcaccgaagg cgtgacgctg gtggcgaaaa aatttcagtc gtctgcaagc tgaataatag 2760
tgacctcgtg ccacggacgc cgctctgccc cctgatacga aaacgccttc ctcaacaaga 2820
ggcaggcgta ctaacgtgca caagacctgc ccgtatcagc aagcgcaaga cgctcgcctc 2880
cacgaaataa cacggtaggt cgcgttgcta ctttttagcg gcagacggcg tgccgttgta 2940
gttgtcggtg ttgttgtcgt tatcaagatc gcggtcattt ccaccgaaag ccgcatcggt 3000
tttgttgtcg ttgtcgagat ctttgtcgtt accgccaaac gctgcatccg tatggtgatc 3060
gttgtccagg tccttgtcgt tacccccaaa tgccgcgtcg gtgtggtggt cattgtccat 3120
atccttgtcg ttgccgccaa atgccgcgtc agtcacgttg tcgttatcca gatccttgtc 3180
gttgccgcca cacgtggcac cggtgctgtt gtcgttgtcc agatcacaat cgtttacggc 3240
aaatgcaggt agcgaagtgc caatgatcgt cagcgcaagc agaaagccgc cgatctttgc 3300
cgtcaggttt ttatacgcgc gcatcaggtt ttcccggata agtgaaaatg atgaagcaag 3360
ggttactgaa cacgttcgat cagtgactaa aacagtatgt aactgcagcc ttctgcaaga 3420
ccgacagagg tcgaccaaac tgcagcctgt ttcataccca tcaatttcta tagcgaccgt 3480
tcacacgact ctcctaccga tgctgggagt accaaaaaac ttccgcactg catttttttg 3540
cagtgtcgga tggtttgacc ggttttgggg agaattgctc aaacggagaa cgatgagttt 3600
tttgttgcgt ggcatgctaa tcgatacatt tatcagtgtg tgatgcggta tggcagcttc 3660
atgcctccgt caaatagtgg acgccagtca cgttgcataa aacctgacgt cactccaaaa 3720
aaggctacgc acgaggacat tgctgagatt cggctgggca ttttcgctgt ttacacaggg 3780
atcgagcaga acgcccccat gccagccacc cgttaactca attgtctttt gccctgaaaa 3840
caacaatccc tggcttttcc gatacatagt ccagaaaagg caaatccatc acctttctgt 3900
tttcttttcg tgaagatgca tttcgcaaga cagggccttt atccgtcacg ataaagaaac 3960
cgacgtgtgt cacatccagc ccgggaagcg ggggtgtaaa tgccaatgta atcaccggtg 4020
cgcaggtggc tcaccacctg actgtcgaca aggcggctcg ggatatacgt catgctacgc 4080
tcaaccacag gcaaccctgg cagatagact ttgcctttgg ccctttcatt aaggcgtttt 4140
ctgacactta ccgcaccggg gcttatctgc gcggtaatgt catccgccac agggtatgcc 4200
gttccgtaag cccaatccgt gaaaaagtgc ttgcgattca aaaagtcaac atcgccaccc 4260
ttgtaacgaa cctgaacgag attcctcaca aaatcctgct gcgatgttga tcttcgaaac 4320
gcttcgacgt aatccagata agcaaaacaa tccagacctc tgaagtcgat gactaattgt 4380
tcaggtacat tcgctgagcc caccaacatg tttgagcggt acggtgttcc taaaaacgct 4440
cctgatacaa ggtcgatcag ctgaccttta ttcatataac ttttgttggt gcgggcttcc 4500
agcacagcat ccagtttttt tgaggtgtag gcatccagat ttagtttaac gggtgttttc 4560
atctctgcct gggcaccctg aatatcactt cccggcgccg gccccgaaac cccacaccct 4620
gccaacattg caaaggctaa agcccatagg gtcgtctttt gcatctgatt caccgtaatt 4680
ccaaagcgtc gtcggacctg attgtggctc gcgatacgcg agcaggctgc tccattcctt 4740
cgagatgccg cattggttag ctcaatcacg gcgcactatt taccacgtgt catcggttgc 4800
gtcatcggct gggagcatca gttggcaatg cattcgcggt ctcggcctca gcagacgctg 4860
gtagtgccca gagtgcagct gaccagcgtg ccgccatcga ggccgccgca gaggccgccc 4920
agcgatacgg attcgtttgc ggcaggggcc atgcccgcta ttgaatcggc tgactggccc 4980
gtgataaagg cctgatgcct cagtacgcca cctggcttac aggcgggttg cattgcaata 5040
ggtctatacc ttttgcaagg ttaacgaact gtcatcaaaa aacatggaag cacaatcaga 5100
aaaaagacct tgagtttcaa ggtctttttt cgtttggtga aaagtgatct gactcaaccc 5160
gcgatcttac cctcctctac tcgggttggc cgttagcacc caaagctacc ttcctgcgcg 5220
aatgcttgtt tcgttatggg catggcgtga tacaagcggt aggcgtacag caggtccatg 5280
agtctcggga acctgattga gagccgctct gcgctgtacc cccctggcct gagccactgt 5340
tcaaggcaac gcttccctga ccttgagcac cacttagctg ggcgccacca tcggcatgca 5400
ccaaaggcat ttgcagagag aggacagcaa agctggccaa tgcaatgaat tttgttttag 5460
agcagatatc tttaagtttc ataacaacca cctttgttga tcagaattgt tgaagaaatc 5520
atgagtcacg cttatgtgtg gcgactcatc gaaatcggtt ccaatgcaag atgggatttt 5580
tacgtccggc ctatccgctg atggcgatgc tgcggattca cctgatgcag aactggtttg 5640
attacagcga tccggcgatg gaggaagcac tttacgagac aacgatcctg cgccagttcg 5700
cagggttgag tctggatcga atcgccgatg aaaccacgat tctcaatttc cggcgcctgc 5760
tggaaaagca tgagttggca ggcgggattt tgcaggtcat caatggctat ctgggtgatc 5820
gaggtttgat gctgcgccaa ggtatggtgg tcgatgcgac gatcattcat gcgccgagct 5880 
cgaccaagaa caaggacggc aaacgcgatc ccgaaatgca tcagacgaag aaaggaaacc 5940
agtatttctt cggcatgaaa gcgcatatcg gcgtcgatgc cgagtcgggt ttagtccata 6000 
gcctggtggg tactgcggcg aatgtggcgg acgtgactca ggtcgatcaa ctgctgcaca 6060 
gtgaggaaac ctatgtcagc ggtgatgcgg gctacaccgg cgtggacaag cgtgcggagc 6120 
atcaggatcg ccagatgatc tggtcaattg cggcacgccc aagccgttat aaaaagcatg 6180 
gcgagaaaag tttgatcgca cgggtctatc gcaaaatcga gttcacgaaa gcccagttgc 6240 
gggcgaaggt tgaacatccg cttcgcgtga tcaagcgcca gtttggttat acgaaagtcc 6300
ggtttcgcgg gctggctaaa aacaccgcgc aacaggctac tctgtttgcc ttgtcgaacc 6360 
tttggatggt gcgaaaacgg ctgctggcga tgggagaggt gcgcctgtaa tgcggaaaaa 6420 
cgccttggaa aggtgctgtt tgaaggaaaa tcgatgagtt aacagcgcaa aaacgtctga 6480 
ctatctgatc gggcgagttt ttttgaacct caggccatga aggcatcaaa aatcgatgct 6540 
tacttcagac cttccttaac ctcagtagcg aggccggata aacgagtccc tttctatgat 6600 
gctgtttcca gtaaactgac aaatttcatg cactgccgcc cgcgtgttca agcgctcaga 6660 
ccttatagga aagcctcacg tctggattca gcttgccgcc gtagtttttc acattgatat 6720 
cgacggtcgc tcgggacttg aggcccagat catcgatcac cagactgcgt accccatgca 6780 
actctgccaa ccctgggact ccgtcacagg aagtggcgtg cgttgccccg acaaaagcga 6840
cccacttacc ttccggtttg ctcagcctta ttttttctgc tgcgtagtaa ttcatggctt 6900 
gggcacgctt tatctcagct ttctccgggg ccatataggt ggacgttgta tccagcgaga 6960
caacgcgcaa cccggcgtgc ttggccgctt ccaccaaggt ggtgaagtta tatttcgtgt 7020 
ggagctcttc cggggcctga tgaccctgac tctgcaaatc gaggtagttt ttcagcctgg 7080 
caggcatcgg actgcctttg ggcgcgctca ggtaattatt gagcgccttg tcatgtgact 7140 
cggcgcagag gtgctccata aaaagcgtgg tcacgccact ggccttcaag ctcttcatgt 7200 
tattgatcag ttcacgcttg ctggacgttg aattgtgacc ctcaccaata acaagccccg 7260 
gcgcatcacg taacagctcg cgcatgacac cgagactgtc cttgcttttc atcttcgtca 7320 
acggcgccag ctcaggtaac ttttgcgcgt tgaaatcatc aaaataacgc gctgccttgg 7380 
caatcagttt cttgtcatta ctgtcaggtg cccataaacc cttggacgtc cccagacaac 7440 
tgtccatttc aaggtaattg agatttatat gaaggtggtc ccgaccttcc gagacaacaa 7500 
cgtcggccag cttgagacct tgagcctcaa ggcgctgttc aagggcgtgc ttgccttctt 7560 
gcaacaggat gctcacaaca tttgcagaca gttggctgct tttccccgct gcttttgagg 7620 
gtgccagcgc ataggggtgc gggctctcac accagcgcgc gagctcggca agatcgctcg 7680 
ccttgaagtt cgtatcctgc aatgctttgc tttgagctga agccgaggtc gaggccacgc 7740 
tctggccgcc gtgcacatga ctgctgcctg ctgcgtccgg cttacgcctt ctggtgtgct 7800 
ttacgccatc ctttccgcca ggctcctgcc cctcgatttt cagccggata ttttctacct 7860 
tcatatccgg atagcgcccg gctggaaagc gcttcaggtc ccccagcatt ggagtctctg 7920 
gcgcaacgct ggctgctgga gaggaactgg cctgtgaaga tcgggcgcga tcgtttcctg 7980 
cagcttgcgc agtgggacgc tcagcttcat aggttggcgg ataatagcct ggagccggtc 8040 
caccgacggg tctcatgatt gaatctccgc gtacgaaaaa tagtgccgag cccgggcgtg 8100 
acgctgcccg ggccccgaca tttcagtcaa tcaatgcgcc ttcgcaatcc cgaactgatc 8160 
aagcaccgga tcaacgttat ggtcgaacgc cttctgcgcc ttatgctttt tcacagcatc 8220 
aatgatcatg gaaataccga aacctaccgc cagggcgcca tcgattgccc agccgaccac 8280 
tggaatcgcg gcgcctaggg cggcacctgc ggcaaggccg gtggcttcac cggcaaccat 8340 
gccgacggcg cgaccgatca tctgtccgcc cagacgccct aggccggctg aggcttcgcg 8400 
gcccatcatc ttcgccccgg cgtcgatgcc acctttaatg gcctcggcgc ccatcctcgt 8460 
gctgtcgtaa atggcctggg ttgcgccaag cttgtcgcca tgagcgatca ggctggacac 8520 
tgaagcaaag cccacgatcg agttgagcgc cttgccgccg acgcccgcct cggcgagctg 8580 
agtcaacatg gacggtccgc cctcatcgct tttgccttcc agaagcttgc ggcctttttt 8640 
ggagtcttgc agcgtaccca acgtgctgtt catgtagttt tcatgctgat tttcggtgaa 8700 
atcagggggc agcacgctgt cgtaaatggc tttctggtta tcggcggttt gcagagactg 8760 
gctggcatca gactttttct ggccaagcag ctgcttcagt gcaccgcctt cgctgaagtt 8820 
ggtcacgtag gacgtggcaa tcttgtcttg cagatcgggt ttgttttcaa gcacctgatt 8880 
ggtagtgggt actttggaat cggggaacag gtctttttgc agttgcaact gggcggacaa 8940 
accgctgatg gcgccgctgt aatcggcatt cggattatgt ttgttgacgg ccttgtccgc 9000 
cttgtccata tcagtctgca gcgcttgacc gctattgacg tttttcgtct gctcgacgac 9060 
tgccttttgc agcgaggcat cactgcggac cagattgcgc tcctgctcgg gaatgctttt 9120 
attgaggtac gcttgtacgt caggatcagc ctgtagctgg gaaatccggt cgttcaaacc 9180 
ctgctcggtc ttgtcggtgt tgcgcaggct gcgcccggcg ataacgcttt gctgggtctg 9240 
ctgcaacttg accatgacgg ccgctttctg tgcaccgctg taagacttgg gtttgtcgaa 9300 
tacgtccttg tccagcttgc tgatatcaat cccggccacc gcattgagcg tcgcagaatc 9360 
gctgagcatg ctggcgaact ggccgccgtt ggtgggtgcg cttttcttga tccactcact 9420 
cagatttttc gcgtcgaaca tcttatcagg gctgtgcgca gccttcttgc gccccgacat 9480 
gcccgcttcg tctacctgac ccaaaaagcc tggttgcgac caggtgctgc aggactgttt 9540 
gagcgctccg gacaaccctg ggttactttg tgccaacccc ttcaggtctt ctgcgtcgac 9600 
attaccgtca actttggtct tgtccgctgc atccactgca tgatgtgggt cggcagcaat 9660 
cgccagtggc atattggctc gcatcactgc cgcgctgcgc accatttcca gtgactgcgg 9720 
gtcagcgtcg gggttgtcct tggtgtagtt ggccaagtcc ttgtcggcac tgtctgcggc 9780 
cttttccata ttttttgcga aggtcttgag atctttgttc gtgatcttgc catctgcgtt 9840 
gccaccaccc tgagcaacgt ccacggcggt cttcagcgcc gggttggcgt tgatgaaatc 9900 
catggccttg ccggcatcgg ggccatcatc acgcgccatc catgccgctg caatcgggcg 9960 
attgagctct ttcgccgcct gctcgcgctc ttcgggcggc agatgggcaa ccatcggctc 10020 
ccaacgtttc agagcttctg gcgaggagta ttcagaattg tcgagaaagg ctgcgtctgc 10080 
ggctttgggg gcgttggaag cgtcggttgc atctgtgttc gtgggagctg cgacctgttc 10140 
aaccggagcg gccggggcag tcgcttcagt cggtgcagcc tcggcaggag aatctgcgca 10200 
gggttgcggc tggacctgat tattcacatt ggcattggca gctgccccgc cactgccctg 10260 
gagcaaaaga gccaggatag acgacgcggt ctgctcggct cctgtcggcg cgccttgcgt 10320 
gttgccggcc ggctgaccga actgcacgcc ggcttgccca ccgccaccca caggtgtcgg 10380 
caaggctttg gcaagaggcg actcaacagc cagagccagt tcgccaggag tgggttggtt 10440 
cacgataacg aagggagaac tggatatacg catggtgagt tgccatccga gagtgagcga 10500 
tggcaactgt gtggttgaag gtgcaagttg gttccagaaa aaatgatcga gatcgccatt 10560 
caggcgaacg ggtcgatttg ctgcttgagc tgaacccgcg cgcgggacag gcgtgagcga 10620 
acggtgccaa tcggcacgcc gaggctgttc gctgtttcct gataattgcc gtccatctcc 10680 
agcgacactt ccagcacttt ttgcatgttc gacggcaggc aatcaatggc ctgaatgact 10740 
cgcgccagtt gccgatgccc ctctacctga tgactgacat caccgtgccc ttccagctcg 10800 
gaatgcactt cgtcttccca gctttcctga tacggctgac gatacatttt gcggaagtga 10860 
ttgcggatca ggttcagcgc gatgccacac agccaggtct gcggtttgct ggcatgttga 10920 
aacttgtgct cgttacgcan ggcttcaaga aacacgcact ggagaatgtc atccacatca 10980 
tcagggttca tacccgcttt ttggataaac gccctgagca tctgaatctg atcgggcggc 11040 
atttggcgaa ataccgcgga cnaaaatggc tgacngggct gggttgagtc nangatcaca 11100 
atcttttgaa acatgggctt accctgatta atggngtaca aaccctatag cgataaccat 11160 
gccnncttaa aaaaanaaaa aactggntga tttatnaaaa aattttaaaa anngaaattt 11220 
tttgtataca aaacttgggc naccgntttt gcccaaaact tttgggcaaa aanatnggan 11280 
ctttcanggg antgatccng gaccgnaacc cttannggaa taatccggtt aaancggcta 11340 
tnaaanagng ttccnctata tggnaaaatt cgggggccca cccnttngaa ccttttggna 11400 
accctttcaa tgttgatttg ncaaataagg gattnnccca aaaggtttng ctttnggg 11458 



Several undefined nucleotides exist in SEQ. ID. No. 18; however, these appear to be 
present in intergenic regions. The EEL of Pseudomonas syringae pv. tomato DC3000 
contains a number of ORFs. One of the products encoded by the EEL is a homolog of 
TnpA' from P. stutzeri. An additional four products are produced by ORF1-4, 
respectively. The nucleotide sequences for a number of these ORFs and their encoded 
protein or polypeptide products are provided below. 
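
The undefined nucleotides noted above are written as the symbol n in the listing of SEQ. ID. No. 18. As a minimal illustrative sketch only, such positions could be located as follows. The function name `find_undefined`, its `start` parameter, and the assumption that the two ten-base groups shown begin at position 11161 (which presumes a row-wise reading of the listing) are hypothetical and are not part of the disclosure.

```python
def find_undefined(seq, start=1):
    """Return positions of undefined bases ('n') in a nucleotide string.

    Whitespace and digits (for example position counters) are ignored.
    'start' is the position assigned to the first base (assumed here).
    """
    cleaned = "".join(ch for ch in seq.lower() if ch.isalpha())
    return [start + i for i, base in enumerate(cleaned) if base == "n"]


# Two ten-base groups taken from the listing of SEQ. ID. No. 18.
fragment = "gccnncttaa aaaaanaaaa"

# Positions relative to the fragment itself.
print(find_undefined(fragment))

# Positions under the assumed starting coordinate of 11161.
print(find_undefined(fragment, start=11161))
```

Comparing the reported positions with the start and stop coordinates of each ORF would then indicate whether an undefined base falls inside a coding region or between ORFs.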

The DNA molecule of ORF1 from the Pseudomonas syringae pv. tomato DC3000 EEL has a 
nucleotide sequence (SEQ. ID. No. 19) as follows: 



atgagacccg tcggtggacc ggctccaggc tattatccgc caacctatga agctgagcgt 60 
cccactgcgc aagctgcagg aaacgatcgc gcccgatctt cacaggccag ttcctctcca 120 
gcagccagcg ttgcgccaga gactccaatg ctgggggacc tgaagcgctt tccagccggg 180 
cgctatccgg atatgaaggt agaaaatatc cggctgaaaa tcgaggggca ggagcctggc 240 
ggaaaggatg gcgtaaagca caccagaagg cgtaagccgg acgcagcagg cagcagtcat 300 
gtgcacggcg gccagagcgt ggcctcgacc tcggcttcag ctcaaagcaa agcattgcag 360 
gatacgaact tcaaggcgag cgatcttgcc gagctcgcgc gctggtgtga gagcccgcac 420 
ccctatgcgc tggcaccctc aaaagcagcg gggaaaagca gccaactgtc tgcaaatgtt 480 
gtgagcatcc tgttgcaaga aggcaagcac gcccttgaac agcgccttga ggctcaaggt 540 
ctcaagctgg ccgacgttgt tgtctcggaa ggtcgggacc accttcatat aaatctcaat 600 
taccttgaaa tggacagttg tctggggacg tccaagggtt tatgggcacc tgacagtaat 660 
gacaagaaac tgattgccaa ggcagcgcgt tattttgatg atttcaacgc gcaaaagtta 720 
cctgagctgg cgccgttgac gaagatgaaa agcaaggaca gtctcggtgt catgcgcgag 780 
ctgttacgtg atgcgccggg gcttgttatt ggtgagggtc acaattcaac gtccagcaag 840 
cgtgaactga tcaataacat gaagagcttg aaggccagtg gcgtgaccac gctttttatg 900 
gagcacctct gcgccgagtc acatgacaag gcgctcaata attacctgag cgcgcccaaa 960 
ggcagtccga tgcctgccag gctgaaaaac tacctcgatt tgcagagtca gggtcatcag 1020 
gccccggaag agctccacac gaaatataac ttcaccacct tggtggaagc ggccaagcac 1080 
gccgggttgc gcgttgtctc gctggataca acgtccacct atatggcccc ggagaaagct 1140 
gagataaagc gtgcccaagc catgaattac tacgcagcag aaaaaataag gctgagcaaa 1200 
ccggaaggta agtgggtcgc ttttgtcggg gcaacgcacg ccacttcctg tgacggagtc 1260 
ccagggttgg cagagttgca tggggtacgc agtctggtga tcgatgatct gggcctcaag 1320 
tcccgagcga ccgtcgatat caatgtgaaa aactacggcg gcaagctgaa tccagacgtg 1380 
aggctttcct ataaggtctg a 1401 



The protein or polypeptide encoded by Pto DC3000 EEL ORF1 has an amino acid 
sequence (SEQ. ID. No. 20) as follows: 



Met Arg Pro Val Gly Gly Pro Ala Pro Gly Tyr Tyr Pro Pro Thr Tyr 
1 5 10 15 

Glu Ala Glu Arg Pro Thr Ala Gln Ala Ala Gly Asn Asp Arg Ala Arg 
20 25 30 

Ser Ser Gln Ala Ser Ser Ser Pro Ala Ala Ser Val Ala Pro Glu Thr 
35 40 45 

Pro Met Leu Gly Asp Leu Lys Arg Phe Pro Ala Gly Arg Tyr Pro Asp 
50 55 60 

Met Lys Val Glu Asn Ile Arg Leu Lys Ile Glu Gly Gln Glu Pro Gly 
65 70 75 80 

Gly Lys Asp Gly Val Lys His Thr Arg Arg Arg Lys Pro Asp Ala Ala 
85 90 95 

Gly Ser Ser His Val His Gly Gly Gln Ser Val Ala Ser Thr Ser Ala 
100 105 110 

Ser Ala Gln Ser Lys Ala Leu Gln Asp Thr Asn Phe Lys Ala Ser Asp 
115 120 125 

Leu Ala Glu Leu Ala Arg Trp Cys Glu Ser Pro His Pro Tyr Ala Leu 
130 135 140 

Ala Pro Ser Lys Ala Ala Gly Lys Ser Ser Gln Leu Ser Ala Asn Val 
145 150 155 160 

Val Ser Ile Leu Leu Gln Glu Gly Lys His Ala Leu Glu Gln Arg Leu 
165 170 175 

Glu Ala Gln Gly Leu Lys Leu Ala Asp Val Val Val Ser Glu Gly Arg 
180 185 190 

Asp His Leu His Ile Asn Leu Asn Tyr Leu Glu Met Asp Ser Cys Leu 
195 200 205 

Gly Thr Ser Lys Gly Leu Trp Ala Pro Asp Ser Asn Asp Lys Lys Leu 
210 215 220 

Ile Ala Lys Ala Ala Arg Tyr Phe Asp Asp Phe Asn Ala Gln Lys Leu 
225 230 235 240 

Pro Glu Leu Ala Pro Leu Thr Lys Met Lys Ser Lys Asp Ser Leu Gly 
245 250 255 

Val Met Arg Glu Leu Leu Arg Asp Ala Pro Gly Leu Val Ile Gly Glu 
260 265 270 

Gly His Asn Ser Thr Ser Ser Lys Arg Glu Leu Ile Asn Asn Met Lys 
275 280 285 

Ser Leu Lys Ala Ser Gly Val Thr Thr Leu Phe Met Glu His Leu Cys 
290 295 300 

Ala Glu Ser His Asp Lys Ala Leu Asn Asn Tyr Leu Ser Ala Pro Lys 
305 310 315 320 

Gly Ser Pro Met Pro Ala Arg Leu Lys Asn Tyr Leu Asp Leu Gln Ser 
325 330 335 

Gln Gly His Gln Ala Pro Glu Glu Leu His Thr Lys Tyr Asn Phe Thr 
340 345 350 

Thr Leu Val Glu Ala Ala Lys His Ala Gly Leu Arg Val Val Ser Leu 
355 360 365 

Asp Thr Thr Ser Thr Tyr Met Ala Pro Glu Lys Ala Glu Ile Lys Arg 
370 375 380 

Ala Gln Ala Met Asn Tyr Tyr Ala Ala Glu Lys Ile Arg Leu Ser Lys 
385 390 395 400 

Pro Glu Gly Lys Trp Val Ala Phe Val Gly Ala Thr His Ala Thr Ser 
405 410 415 

Cys Asp Gly Val Pro Gly Leu Ala Glu Leu His Gly Val Arg Ser Leu 
420 425 430 

Val Ile Asp Asp Leu Gly Leu Lys Ser Arg Ala Thr Val Asp Ile Asn 
435 440 445 

Val Lys Asn Tyr Gly Gly Lys Leu Asn Pro Asp Val Arg Leu Ser Tyr 
450 455 460 

Lys Val 
465 



The DNA molecule of ORF2 from the Pseudomonas syringae pv. tomato DC3000 EEL has a 
nucleotide sequence (SEQ. ID. No. 21) as follows: 

atgcaaaaga cgaccctatg ggctttagcc tttgcaatgt tggcagggtg tggggtttcg 60 
gggccggcgc cgggaagtga tattcagggt gcccaggcag agatgaaaac acccgttaaa 120 
ctaaatctgg atgcctacac ctcaaaaaaa ctggatgctg tgctggaagc ccgcaccaac 180 
aaaagttata tgaataaagg tcagctgatc gaccttgtat caggagcgtt tttaggaaca 240 
ccgtaccgct caaacatgtt ggtgggctca gcgaatgtac ctgaacaatt agtcatcgac 300 
ttcagaggtc tggattgttt tgcttatctg gattacgtcg aagcgtttcg aagatcaaca 360 
tcgcagcagg attttgtgag gaatctcgtt caggttcgtt acaagggtgg cgatgttgac 420 
tttttgaatc gcaagcactt tttcacggat tgggcttacg gaacggcata ccctgtggcg 480 
gatgacatta ccgcgcagat aagccccggt gcggtaagtg tcagaaaacg ccttaatgaa 540 
agggccaaag gcaaagtcta tctgccaggg ttgcctgtgg ttgagcgtag catgacgtat 600 
atcccgagcc gccttgtcga cagtcaggtg gtgagccacc tgcgcaccgg tgattacatt 660 
ggcatttaca cccccgcttc ccgggctgga tgtgacacac gtcggtttct ttatcgtgac 720 
ggataa 726 



The protein or polypeptide encoded by Pto DC3000 EEL ORF2 has an amino acid 
sequence (SEQ. ID. No. 22) as follows: 

Met Gln Lys Thr Thr Leu Trp Ala Leu Ala Phe Ala Met Leu Ala Gly 
1 5 10 15 

Cys Gly Val Ser Gly Pro Ala Pro Gly Ser Asp Ile Gln Gly Ala Gln 
20 25 30 

Ala Glu Met Lys Thr Pro Val Lys Leu Asn Leu Asp Ala Tyr Thr Ser 
35 40 45 

Lys Lys Leu Asp Ala Val Leu Glu Ala Arg Thr Asn Lys Ser Tyr Met 
50 55 60 

Asn Lys Gly Gln Leu Ile Asp Leu Val Ser Gly Ala Phe Leu Gly Thr 
65 70 75 80 

Pro Tyr Arg Ser Asn Met Leu Val Gly Ser Ala Asn Val Pro Glu Gln 
85 90 95 

Leu Val Ile Asp Phe Arg Gly Leu Asp Cys Phe Ala Tyr Leu Asp Tyr 
100 105 110 

Val Glu Ala Phe Arg Arg Ser Thr Ser Gln Gln Asp Phe Val Arg Asn 
115 120 125 

Leu Val Gln Val Arg Tyr Lys Gly Gly Asp Val Asp Phe Leu Asn Arg 
130 135 140 

Lys His Phe Phe Thr Asp Trp Ala Tyr Gly Thr Ala Tyr Pro Val Ala 
145 150 155 160 

Asp Asp Ile Thr Ala Gln Ile Ser Pro Gly Ala Val Ser Val Arg Lys 
165 170 175 

Arg Leu Asn Glu Arg Ala Lys Gly Lys Val Tyr Leu Pro Gly Leu Pro 
180 185 190 

Val Val Glu Arg Ser Met Thr Tyr Ile Pro Ser Arg Leu Val Asp Ser 
195 200 205 

Gln Val Val Ser His Leu Arg Thr Gly Asp Tyr Ile Gly Ile Tyr Thr 
210 215 220 

Pro Ala Ser Arg Ala Gly Cys Asp Thr Arg Arg Phe Leu Tyr Arg Asp 
225 230 235 240 

Gly 



The DNA molecule of ORF3 from the Pseudomonas syringae pv. tomato DC3000 EEL has a 
nucleotide sequence (SEQ. ID. No. 23) as follows: 



atgcgcgcgt ataaaaacct gacggcaaag atcggcggct ttctgcttgc gctgacgatc 60 
attggcactt cgctacctgc atttgccgta aacgattgtg atctggacaa cgacaacagc 120 
accggtgcca cgtgtggcgg caacgacaag gatctggata acgacaacgt gactgacgcg 180 
gcatttggcg gcaacgacaa ggatatggac aatgaccacc acaccgacgc ggcatttggg 240 
ggtaacgaca aggacctgga caacgatcac catacggatg cagcgtttgg cggtaacgac 300 
aaagatctcg acaacgacaa caaaaccgat gcggctttcg gtggaaatga ccgcgatctt 360 
gataacgaca acaacaccga caactacaac ggcacgccgt ctgccgctaa aaagtag 417 



The protein or polypeptide encoded by Pto DC3000 EEL ORF3 has an amino acid 
sequence (SEQ. ID. No. 24) as follows: 






Met Arg Ala Tyr Lys Asn Leu Thr Ala Lys Ile Gly Gly Phe Leu Leu 
1 5 10 15 

Ala Leu Thr Ile Ile Gly Thr Ser Leu Pro Ala Phe Ala Val Asn Asp 
20 25 30 

Cys Asp Leu Asp Asn Asp Asn Ser Thr Gly Ala Thr Cys Gly Gly Asn 
35 40 45 

Asp Lys Asp Leu Asp Asn Asp Asn Val Thr Asp Ala Ala Phe Gly Gly 
50 55 60 

Asn Asp Lys Asp Met Asp Asn Asp His His Thr Asp Ala Ala Phe Gly 
65 70 75 80 

Gly Asn Asp Lys Asp Leu Asp Asn Asp His His Thr Asp Ala Ala Phe 
85 90 95 

Gly Gly Asn Asp Lys Asp Leu Asp Asn Asp Asn Lys Thr Asp Ala Ala 
100 105 110 

Phe Gly Gly Asn Asp Arg Asp Leu Asp Asn Asp Asn Asn Thr Asp Asn 
115 120 125 

Tyr Asn Gly Thr Pro Ser Ala Ala Lys Lys 
130 135 
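
The correspondence between a nucleotide listing and its encoded amino acid sequence can be checked by translating codons with the standard genetic code. The following is a minimal sketch under stated assumptions: the names `CODON_TABLE` and `translate` are hypothetical, the standard code is assumed to apply, output uses one-letter residue codes rather than the three-letter codes of the listings, and any codon containing an undefined base is reported as X. The two input fragments are the first sixty bases and the final twenty-seven bases of the ORF3 nucleotide sequence (SEQ. ID. No. 23).

```python
# Standard genetic code, built in the conventional t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in frame 1 until the first stop codon.

    Whitespace and digits are ignored; codons that contain a symbol
    other than a, c, g or t are translated as 'X'.
    """
    cleaned = "".join(ch for ch in dna.lower() if ch.isalpha())
    protein = []
    for i in range(0, len(cleaned) - 2, 3):
        residue = CODON_TABLE.get(cleaned[i:i + 3], "X")
        if residue == "*":  # stop codon ends the reading frame
            break
        protein.append(residue)
    return "".join(protein)


orf3_start = "atgcgcgcgt ataaaaacct gacggcaaag atcggcggct ttctgcttgc gctgacgatc"
orf3_end = "ggcacgccgt ctgccgctaa aaagtag"

print(translate(orf3_start))  # MRAYKNLTAKIGGFLLALTI
print(translate(orf3_end))    # GTPSAAKK
```

The first output corresponds to residues 1 to 20 of SEQ. ID. No. 24 (Met Arg Ala Tyr Lys ... Leu Thr Ile), and the second to its final eight residues (Gly Thr Pro Ser Ala Ala Lys Lys), after which the tag stop codon terminates the frame.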



Mutation of P. syringae pv. tomato DC3000 EEL ORF3 has now been shown to significantly 
reduce virulence. Perhaps more interestingly, overexpression of ORF3 strongly increases 
lesion size. Hence, this effector is biologically active and appears to have a key role in 
symptom production. 



The DNA molecule of ORF4 from the Pseudomonas syringae pv. tomato DC3000 EEL has a 
nucleotide sequence (SEQ. ID. No. 25) as follows: 



atgaacaaga tcgtctacgt aaaagcttac ttcaaaccca ttggggagga agtctcggtt 60 
aaagtaccta caggcgaaat taaaaagggc tttttcggcg acaaggaaat catgaaaaaa 120 
gagacccagt ggcagcaaac cgggtggtct gattgtcaga tagacggtga acggctatcg 180 
aaagacgtcg aagacgcagt ggcgcaactc aatgctgacg gttatgagat tcaaacggta 240 
ttgcctatat tgtccggggc ttatgattat gcgctcaaat accgatacga aatacgtcac 300 
aatagaactg aactaagccc aggagaccag tcctatgtct tcggctatgg ctacagcttc 360 
accgaaggcg tgacgctggt ggcgaaaaaa tttcagtcgt ctgcaagctg a 411 



The protein or polypeptide encoded by Pto DC3000 EEL ORF4 has an amino acid 
sequence (SEQ. ID. No. 26) as follows: 



Met Asn Lys Ile Val Tyr Val Lys Ala Tyr Phe Lys Pro Ile Gly Glu 
1 5 10 15 

Glu Val Ser Val Lys Val Pro Thr Gly Glu Ile Lys Lys Gly Phe Phe 
20 25 30 

Gly Asp Lys Glu Ile Met Lys Lys Glu Thr Gln Trp Gln Gln Thr Gly 
35 40 45 

Trp Ser Asp Cys Gln Ile Asp Gly Glu Arg Leu Ser Lys Asp Val Glu 
50 55 60 

Asp Ala Val Ala Gln Leu Asn Ala Asp Gly Tyr Glu Ile Gln Thr Val 
65 70 75 80 

Leu Pro Ile Leu Ser Gly Ala Tyr Asp Tyr Ala Leu Lys Tyr Arg Tyr 
85 90 95 

Glu Ile Arg His Asn Arg Thr Glu Leu Ser Pro Gly Asp Gln Ser Tyr 
100 105 110 

Val Phe Gly Tyr Gly Tyr Ser Phe Thr Glu Gly Val Thr Leu Val Ala 
115 120 125 

Lys Lys Phe Gln Ser Ser Ala Ser 
130 135 



The EEL of Pseudomonas syringae pv. syringae B728a contains a 
number of ORFs. Two of the open reading frames appear to be mobile genetic 
elements without comparable homologs in EELs of other Pseudomonas syringae 
variants. An additional four products are produced by ORF1-2 and ORF5-6, 
respectively. The nucleotide sequences for a number of these ORFs and their encoded 
protein or polypeptide products are provided below. 
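
Because each ORF ends in a stop codon that is not translated, the length of an ORF in nucleotides should equal three times the number of encoded residues plus three. The sketch below applies that check to the four B728a ORFs. It is illustrative only: the function name `expected_residues` is hypothetical, and the lengths are values read from the final position counters of SEQ. ID. Nos. 27, 29, 31, and 33 and from the residue numbering of SEQ. ID. Nos. 28, 30, 32, and 34.

```python
def expected_residues(orf_length_nt):
    """Number of residues encoded by an ORF whose length includes the stop codon."""
    if orf_length_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_length_nt // 3 - 1


# (nucleotide length, residue count) as read from the sequence listings.
listed = {
    "ORF1": (972, 323),
    "ORF2": (1149, 382),
    "ORF5": (1236, 411),
    "ORF6": (363, 120),
}

for name, (length_nt, residues) in listed.items():
    consistent = expected_residues(length_nt) == residues
    print(name, length_nt, residues, consistent)
```

A mismatch in such a check would point to a transcription error in either the nucleotide listing or the amino acid listing rather than to a property of the ORF itself.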

The DNA molecule of ORF1 from the Pseudomonas syringae pv. 
syringae B728a EEL has a nucleotide sequence (SEQ. ID. No. 27) as follows: 



atgggttgcg tatcgtcaaa agcatctgtc atttcttcgg acagctttcg cgcatcatat 60 
acaaactctc cagaggcatc ctcagtccat caacgagcca ggacgccaag gtgcggtgag 120 
cttcaggggc cccaagtgag cagattgatg ccttaccagc aggcgttagt aggtgtggcc 180 
cgatggccta atccgcattt taacagggac gatgcgcccc accagatgga gtatggagaa 240 
tcgttctacc ataaaagccg agagcttggt gcgtcggtcg ccaatggaga gatagaaacg 300 
tttcaggagc tctggagtga agctcgtgat tggagagctt ccagagcagg ccaagatgct 360 
cggcttttta gttcatcgcg tgatcccaac tcttcacggg cgtttgttac gcctataact 420 
ggaccatacg aatttttaaa agatagattc gcaaaccgta aagatggaga aaagcataag 480 
atgatggatt ttctcccaca cagcaatacg tttaggtttc atgggaaaat tgacggtgag 540 
cgacttcctc tcacctggat ctcgataagt tctgatcgtc gtgccgacag aacaaaggat 600 
ccttaccaaa ggttgcgcga ccaaggcatg aacgatgtgg gtgagcctaa tgtgatgttg 660 
cacacccaag ccgagtatgt gcccaaaatt atgcaacatg tggagcatct ttataaggcc 720 
gctacggatg ctgcattgtc cgatgccaat gcgctgaaaa aactcgcaga gatacattgg 780 
tggacggtac aagctgttcc cgactttcgt ggaagtgcag ctaaggctga gctctgcgtg 840 
cgctccattg cccaggcaag gggcatggac ctgccgccga tgagactcgg catcgtgccg 900 
gatctggaag cgcttacgat gcctttgaaa gactttgtga aaagttacga agggttcttc 960 
gaacataact ga 972 



The protein or polypeptide encoded by Psy B728a EEL ORF1 has an amino acid 
sequence (SEQ. ID. No. 28) as follows: 



Met Gly Cys Val Ser Ser Lys Ala Ser Val Ile Ser Ser Asp Ser Phe 
1 5 10 15 

Arg Ala Ser Tyr Thr Asn Ser Pro Glu Ala Ser Ser Val His Gln Arg 
20 25 30 

Ala Arg Thr Pro Arg Cys Gly Glu Leu Gln Gly Pro Gln Val Ser Arg 
35 40 45 

Leu Met Pro Tyr Gln Gln Ala Leu Val Gly Val Ala Arg Trp Pro Asn 
50 55 60 

Pro His Phe Asn Arg Asp Asp Ala Pro His Gln Met Glu Tyr Gly Glu 
65 70 75 80 

Ser Phe Tyr His Lys Ser Arg Glu Leu Gly Ala Ser Val Ala Asn Gly 
85 90 95 

Glu Ile Glu Thr Phe Gln Glu Leu Trp Ser Glu Ala Arg Asp Trp Arg 
100 105 110 

Ala Ser Arg Ala Gly Gln Asp Ala Arg Leu Phe Ser Ser Ser Arg Asp 
115 120 125 

Pro Asn Ser Ser Arg Ala Phe Val Thr Pro Ile Thr Gly Pro Tyr Glu 
130 135 140 

Phe Leu Lys Asp Arg Phe Ala Asn Arg Lys Asp Gly Glu Lys His Lys 
145 150 155 160 

Met Met Asp Phe Leu Pro His Ser Asn Thr Phe Arg Phe His Gly Lys 
165 170 175 

Ile Asp Gly Glu Arg Leu Pro Leu Thr Trp Ile Ser Ile Ser Ser Asp 
180 185 190 

Arg Arg Ala Asp Arg Thr Lys Asp Pro Tyr Gln Arg Leu Arg Asp Gln 
195 200 205 

Gly Met Asn Asp Val Gly Glu Pro Asn Val Met Leu His Thr Gln Ala 
210 215 220 

Glu Tyr Val Pro Lys Ile Met Gln His Val Glu His Leu Tyr Lys Ala 
225 230 235 240 

Ala Thr Asp Ala Ala Leu Ser Asp Ala Asn Ala Leu Lys Lys Leu Ala 
245 250 255 

Glu Ile His Trp Trp Thr Val Gln Ala Val Pro Asp Phe Arg Gly Ser 
260 265 270 

Ala Ala Lys Ala Glu Leu Cys Val Arg Ser Ile Ala Gln Ala Arg Gly 
275 280 285 

Met Asp Leu Pro Pro Met Arg Leu Gly Ile Val Pro Asp Leu Glu Ala 
290 295 300 

Leu Thr Met Pro Leu Lys Asp Phe Val Lys Ser Tyr Glu Gly Phe Phe 
305 310 315 320 

Glu His Asn 



As indicated in Table 1 (see Example 2), the DNA molecule encoding this protein or 
polypeptide bears significant homology to the nucleotide sequence from 
Pseudomonas syringae pv. phaseolicola which encodes AvrPphC. 



The DNA molecule of ORF2 from the Pseudomonas syringae pv. 
syringae B728a EEL has a nucleotide sequence (SEQ. ID. No. 29) as follows: 



atgagaattc acagttccgg tcatggcatc tccggaccag tatcctctgc agaaaccgtt 60 
gaaaaggccg tgcaatcatc ggcccaagcg cagaatgaag cgtctcacag cggtccatca 120 
gaacatcctg aatcccgctc ctgtcaggca cgcccgaact acccttattc gtcagtcaaa 180 
acacggttac cccctgttgc gtctgcaggg cagtcgctgt ctgagacacc ctcttcattg 240 
cctggctacc tgctgttacg tcggcttgat cgtcgtccgc tggaccagga cgcaataaag 300 
gggcttattc ctgctgatga agcagtgggc gaagcgcgcc gcgcgttgcc cttcggcagg 360 
ggcaacattg atgtggatgc gcaacgctcc aacctggaaa gcggggcccg cacgctcgcc 420 
gcaagacgcc tgagaaaaga cgccgagacg gcgggtcatg agccgatgcc cgagaacgaa 480 
gacatgaact ggcatgtgct ggttgccatg tcgggtcagg tgttcggggc tggcaactgt 540 
ggcgaacatg cccgtatagc gagctttgcc tacggtgcat cggctcagga aaaaggacgc 600 
gctggcgatg aaaatattca tctggctgcg cagagcgggg aagatcatgt ctgggctgaa 660 
acggatgatt ccagcgctgg ctcttcgcct attgtcatgg acccctggtc aaacggtcct 720 
gccgtttttg cagaggacag tcggtttgct aaagataggc gcgcggtaga gcgaacggat 780 
tcgttcacgc tttcaaccgc tgccaaagca ggcaagatta cacgagagac agccgagaag 840 
gcgctgaccc aagcgaccag ccgtttgcag caacgtcttg ctgatcagca ggcgcaagtc 900 
tcgccggttg aaggtggtcg ctatcggcaa gaaaactcgg tgcttgatga tgcgttcgcc 960 
cgacgagtca gtgacatgtt gaacaatgcc gatccacggc gtgcattgca ggtggaaatc 1020 
gaggcgtccg gagttgcaat gtcgctgggt gcccaaggcg tcaagacggt cgtccgacag 1080 
gcgccaaaag tggtcaggca agccagaggc gtcgcatctg ctaaaggtat gtctccgcga 1140 
gcaacctga 1149 



The protein or polypeptide encoded by Psy B728a EEL ORF2 has an amino acid 
sequence (SEQ. ID. No. 30) as follows: 



Met Arg Ile His Ser Ser Gly His Gly Ile Ser Gly Pro Val Ser Ser 
1 5 10 15 

Ala Glu Thr Val Glu Lys Ala Val Gln Ser Ser Ala Gln Ala Gln Asn 
20 25 30 

Glu Ala Ser His Ser Gly Pro Ser Glu His Pro Glu Ser Arg Ser Cys 
35 40 45 

Gln Ala Arg Pro Asn Tyr Pro Tyr Ser Ser Val Lys Thr Arg Leu Pro 
50 55 60 

Pro Val Ala Ser Ala Gly Gln Ser Leu Ser Glu Thr Pro Ser Ser Leu 
65 70 75 80 

Pro Gly Tyr Leu Leu Leu Arg Arg Leu Asp Arg Arg Pro Leu Asp Gln 
85 90 95 

Asp Ala Ile Lys Gly Leu Ile Pro Ala Asp Glu Ala Val Gly Glu Ala 
100 105 110 

Arg Arg Ala Leu Pro Phe Gly Arg Gly Asn Ile Asp Val Asp Ala Gln 
115 120 125 

Arg Ser Asn Leu Glu Ser Gly Ala Arg Thr Leu Ala Ala Arg Arg Leu 
130 135 140 

Arg Lys Asp Ala Glu Thr Ala Gly His Glu Pro Met Pro Glu Asn Glu 
145 150 155 160 

Asp Met Asn Trp His Val Leu Val Ala Met Ser Gly Gln Val Phe Gly 
165 170 175 

Ala Gly Asn Cys Gly Glu His Ala Arg Ile Ala Ser Phe Ala Tyr Gly 
180 185 190 

Ala Ser Ala Gln Glu Lys Gly Arg Ala Gly Asp Glu Asn Ile His Leu 
195 200 205 

Ala Ala Gln Ser Gly Glu Asp His Val Trp Ala Glu Thr Asp Asp Ser 
210 215 220 

Ser Ala Gly Ser Ser Pro Ile Val Met Asp Pro Trp Ser Asn Gly Pro 
225 230 235 240 

Ala Val Phe Ala Glu Asp Ser Arg Phe Ala Lys Asp Arg Arg Ala Val 
245 250 255 

Glu Arg Thr Asp Ser Phe Thr Leu Ser Thr Ala Ala Lys Ala Gly Lys 
260 265 270 

Ile Thr Arg Glu Thr Ala Glu Lys Ala Leu Thr Gln Ala Thr Ser Arg 
275 280 285 

Leu Gln Gln Arg Leu Ala Asp Gln Gln Ala Gln Val Ser Pro Val Glu 
290 295 300 

Gly Gly Arg Tyr Arg Gln Glu Asn Ser Val Leu Asp Asp Ala Phe Ala 
305 310 315 320 

Arg Arg Val Ser Asp Met Leu Asn Asn Ala Asp Pro Arg Arg Ala Leu 
325 330 335 

Gln Val Glu Ile Glu Ala Ser Gly Val Ala Met Ser Leu Gly Ala Gln 
340 345 350 

Gly Val Lys Thr Val Val Arg Gln Ala Pro Lys Val Val Arg Gln Ala 
355 360 365 

Arg Gly Val Ala Ser Ala Lys Gly Met Ser Pro Arg Ala Thr 
370 375 380 



As indicated in Table 1 (see Example 2), the DNA molecule encoding this protein or 
polypeptide bears significant homology to the nucleotide sequence from 
Pseudomonas syringae pv. phaseolicola which encodes AvrPphE. 

The DNA molecule of ORF5 from the Pseudomonas syringae pv. 
syringae B728a EEL has a nucleotide sequence (SEQ. ID. No. 31) as follows: 



atgaatatct caggtccgaa cagacgtcag gggactcagg cagagaacac tgaaagcgct 60 
tcgtcatcat cggtaactaa cccaccgcta cagcgtggcg agggcagacg tctgcgacgt 120 
caggatgcgc tgccaacgga tatcagatac aacgccaacc agacagcgac atcaccgcaa 180 
aacgcgcgcg cggcaggaag atatgaatca ggggccagct catccggcgc gaatgatact 240 
ccgcaggctg aaggttcaat gccttcgtcg tccgcccttt tacaatttcg cctcgccggc 300 
gggcggaacc attctgagct ggaaaatttt catactatga tgctgaactc accgaaagca 360 
tcacggggag atgctatacc tgagaagccc gaagcaatac ctaagcgcct actggagaag 420 
atggaaccga ttaacctggc ccagttagct ttgcgtgata aggatctgca tgaatatgcc 480 
gtaatggtct gtaaccaagt gaaaaagggt gaaggtccga actccaatat tacgcaagga 540 
gatatcaagt tactgccgct gttcgccaaa gcggaaaata caagaaatcc cggcttgaat 600 
ctgcatacat tcaaaagtca taaagactgt taccaggcga taaaagagca aaacagggat 660 
attcaaaaaa acaagcaatc gctgagtatg cgggttgttt accccccatt caaaaagatg 720 
ccagaccacc atatagcctt ggatatccaa ctgagatacg gccatcgacc gtcgattgtc 780 
ggctttgagt ctgcccctgg gaacattata gatgctgcag aaagggaaat actttcagca 840 
ttaggcaacg tcaaaatcaa aatggtagga aattttcttc aatactcgaa aactgactgc 900 
accatgtttg cgcttaataa cgccctgaaa gcttttaaac atcacgaaga atataccgcc 960 
cgtctgcaca atggagaaaa gcaggtgcct atcccggcga ccttcttgaa acatgctcag 1020 
tcaaaaagct tagtggagaa tcacccggaa aaagatacca ccgtcactaa agaccagggc 1080 
ggtctgcata tggaaacgct attacacaga aaccgtgcct accgggcgca acgatctgcc 1140 
ggtcagcacg ttacctctat tgaaggtttc agaatgcagg aaataaagag agcaggtgac 1200 
ttccttgccg caaacagggt ccgggccaag ccttga 1236 



The protein or polypeptide encoded by Psy B728a EEL ORF5 has an amino acid 
sequence (SEQ. ID. No. 32) as follows: 



Met Asn Ile Ser Gly Pro Asn Arg Arg Gln Gly Thr Gln Ala Glu Asn 
1 5 10 15 

Thr Glu Ser Ala Ser Ser Ser Ser Val Thr Asn Pro Pro Leu Gln Arg 
20 25 30 

Gly Glu Gly Arg Arg Leu Arg Arg Gln Asp Ala Leu Pro Thr Asp Ile 
35 40 45 

Arg Tyr Asn Ala Asn Gln Thr Ala Thr Ser Pro Gln Asn Ala Arg Ala 
50 55 60 

Ala Gly Arg Tyr Glu Ser Gly Ala Ser Ser Ser Gly Ala Asn Asp Thr 
65 70 75 80 

Pro Gln Ala Glu Gly Ser Met Pro Ser Ser Ser Ala Leu Leu Gln Phe 
85 90 95 

Arg Leu Ala Gly Gly Arg Asn His Ser Glu Leu Glu Asn Phe His Thr 
100 105 110 

Met Met Leu Asn Ser Pro Lys Ala Ser Arg Gly Asp Ala Ile Pro Glu 
115 120 125 

Lys Pro Glu Ala Ile Pro Lys Arg Leu Leu Glu Lys Met Glu Pro Ile 
130 135 140 

Asn Leu Ala Gln Leu Ala Leu Arg Asp Lys Asp Leu His Glu Tyr Ala 
145 150 155 160 

Val Met Val Cys Asn Gln Val Lys Lys Gly Glu Gly Pro Asn Ser Asn 
165 170 175 

Ile Thr Gln Gly Asp Ile Lys Leu Leu Pro Leu Phe Ala Lys Ala Glu 
180 185 190 

Asn Thr Arg Asn Pro Gly Leu Asn Leu His Thr Phe Lys Ser His Lys 
195 200 205 

Asp Cys Tyr Gln Ala Ile Lys Glu Gln Asn Arg Asp Ile Gln Lys Asn 
210 215 220 

Lys Gln Ser Leu Ser Met Arg Val Val Tyr Pro Pro Phe Lys Lys Met 
225 230 235 240 

Pro Asp His His Ile Ala Leu Asp Ile Gln Leu Arg Tyr Gly His Arg 
245 250 255 

Pro Ser Ile Val Gly Phe Glu Ser Ala Pro Gly Asn Ile Ile Asp Ala 
260 265 270 

Ala Glu Arg Glu Ile Leu Ser Ala Leu Gly Asn Val Lys Ile Lys Met 
275 280 285 

Val Gly Asn Phe Leu Gln Tyr Ser Lys Thr Asp Cys Thr Met Phe Ala 
290 295 300 

Leu Asn Asn Ala Leu Lys Ala Phe Lys His His Glu Glu Tyr Thr Ala 
305 310 315 320 

Arg Leu His Asn Gly Glu Lys Gln Val Pro Ile Pro Ala Thr Phe Leu 
325 330 335 

Lys His Ala Gln Ser Lys Ser Leu Val Glu Asn His Pro Glu Lys Asp 
340 345 350 

Thr Thr Val Thr Lys Asp Gln Gly Gly Leu His Met Glu Thr Leu Leu 
355 360 365 

His Arg Asn Arg Ala Tyr Arg Ala Gln Arg Ser Ala Gly Gln His Val 
370 375 380 

Thr Ser Ile Glu Gly Phe Arg Met Gln Glu Ile Lys Arg Ala Gly Asp 
385 390 395 400 

Phe Leu Ala Ala Asn Arg Val Arg Ala Lys Pro 
405 410 



The DNA molecule of ORF6 from the Pseudomonas syringae pv. 
syringae B728a EEL has a nucleotide sequence (SEQ. ID. No. 33) as follows: 



atgacgctgg aacggattga acagcaaaat acgctgtttg tttatctgtg cgtgggcacg 60 
ctttctactc cagccagcag cacacttctg agcgatattc tggccgccaa cctctttcat 120 
tatgggtcca gcgatggggc ggccttcggg ctggacgaaa aaaataatga agtgctgctt 180 
tttcagcggt ttgatccgtt acggattgat gaggatcact ttgtcagcgc ctgcgttcag 240 
atgatcgaag tggcgaaaat atggcgggca aagttactgc atggccattc tgctccgctc 300 
gcctcctcaa ccaggctgac gaaagccggt ttaatgctaa ccatggcggg gactattcga 360 
tga 363 



The protein or polypeptide encoded by Psy B728a EEL ORF6 has an amino acid 
sequence (SEQ. ID. No. 34) as follows: 



Met Thr Leu Glu Arg Ile Glu Gln Gln Asn Thr Leu Phe Val Tyr Leu 
1 5 10 15 

Cys Val Gly Thr Leu Ser Thr Pro Ala Ser Ser Thr Leu Leu Ser Asp 
20 25 30 

Ile Leu Ala Ala Asn Leu Phe His Tyr Gly Ser Ser Asp Gly Ala Ala 
35 40 45 

Phe Gly Leu Asp Glu Lys Asn Asn Glu Val Leu Leu Phe Gln Arg Phe 
50 55 60 

Asp Pro Leu Arg Ile Asp Glu Asp His Phe Val Ser Ala Cys Val Gln 
65 70 75 80 

Met Ile Glu Val Ala Lys Ile Trp Arg Ala Lys Leu Leu His Gly His 
85 90 95 

Ser Ala Pro Leu Ala Ser Ser Thr Arg Leu Thr Lys Ala Gly Leu Met 
100 105 110 

Leu Thr Met Ala Gly Thr Ile Arg 
115 120 



The EEL of Pseudomonas syringae pv. syringae 61 contains a number 
of ORFs. One of the open reading frames encodes the outer membrane protein 
HopPsyA. The DNA molecule which encodes HopPsyA has a nucleotide sequence 
(SEQ. ID. No. 35) as follows: 



gtgaacccta tccatgcacg cttctccagc gtagaagcgc tcagacattc aaacgttgat 60 
attcaggcaa tcaaatccga gggtcagttg gaagtcaacg gcaagcgtta cgagattcgt 120 
gcggccgctg acggctcaat cgcggtcctc agacccgatc aacagtccaa agcagacaag 180 
ttcttcaaag gcgcagcgca tcttattggc ggacaaagcc agcgtgccca aatagcccag 240 
gtactcaacg agaaagcggc ggcagttcca cgcctggaca gaatgttggg cagacgcttc 300 
gatctggaga agggcggaag tagcgctgtg ggcgccgcaa tcaaggctgc cgacagccga 360 
ctgacatcaa aacagacatt tgccagcttc cagcaatggg ctgaaaaagc tgaggcgctc 420 
gggcgatacc gaaatcggta tctacatgat ctacaagagg gacacgccag acacaacgcc 480 
tatgaatgcg gcagagtcaa gaacattacc tggaaacgct acaggctctc gataacaaga 540 
aaaaccttat catacgcccc gcagatccat gatgatcggg aagaggaaga gcttgatctg 600 
ggccgataca tcgctgaaga cagaaatgcc agaaccggct tttttagaat ggttcctaaa 660 
gaccaacgcg cacctgagac aaactcggga cgacttacca ttggtgtaga acctaaatat 720 
ggagcgcagt tggccctcgc aatggcaacc ctgatggaca agcacaaatc tgtgacacaa 780 
ggtaaagtcg tcggtccggc aaaatatggc cagcaaactg actctgccat tctttacata 840 
aatggtgatc ttgcaaaagc agtaaaactg ggcgaaaagc tgaaaaagct gagcggtatc 900 
cctcctgaag gattcgtcga acatacaccg ctaagcatgc agtcgacggg tctcggtctt 960 
tcttatgccg agtcggttga agggcagcct tccagccacg gacaggcgag aacacacgtt 1020 
atcatggatg ccttgaaagg ccagggcccc atggagaaca gactcaaaat ggcgctggca 1080 
gaaagaggct atgacccgga aaatccggcg ctcagggcgc gaaactga 1128 



HopPsyA has an amino acid sequence (SEQ. ID. No. 36) as follows: 



Val Asn Pro Ile His Ala Arg Phe Ser Ser Val Glu Ala Leu Arg His 
1 5 10 15 

Ser Asn Val Asp Ile Gln Ala Ile Lys Ser Glu Gly Gln Leu Glu Val 
20 25 30 

Asn Gly Lys Arg Tyr Glu Ile Arg Ala Ala Ala Asp Gly Ser Ile Ala 
35 40 45 

Val Leu Arg Pro Asp Gln Gln Ser Lys Ala Asp Lys Phe Phe Lys Gly 
50 55 60 

Ala Ala His Leu Ile Gly Gly Gln Ser Gln Arg Ala Gln Ile Ala Gln 
65 70 75 80 

Val Leu Asn Glu Lys Ala Ala Ala Val Pro Arg Leu Asp Arg Met Leu 
85 90 95 

Gly Arg Arg Phe Asp Leu Glu Lys Gly Gly Ser Ser Ala Val Gly Ala 
100 105 110 

Ala Ile Lys Ala Ala Asp Ser Arg Leu Thr Ser Lys Gln Thr Phe Ala 
115 120 125 

Ser Phe Gln Gln Trp Ala Glu Lys Ala Glu Ala Leu Gly Arg Tyr Arg 
130 135 140 

Asn Arg Tyr Leu His Asp Leu Gln Glu Gly His Ala Arg His Asn Ala 
145 150 155 160 

Tyr Glu Cys Gly Arg Val Lys Asn Ile Thr Trp Lys Arg Tyr Arg Leu 
165 170 175 

Ser Ile Thr Arg Lys Thr Leu Ser Tyr Ala Pro Gln Ile His Asp Asp 
180 185 190 

Arg Glu Glu Glu Glu Leu Asp Leu Gly Arg Tyr Ile Ala Glu Asp Arg 
195 200 205 

Asn Ala Arg Thr Gly Phe Phe Arg Met Val Pro Lys Asp Gln Arg Ala 
210 215 220 

Pro Glu Thr Asn Ser Gly Arg Leu Thr Ile Gly Val Glu Pro Lys Tyr 
225 230 235 240 

Gly Ala Gln Leu Ala Leu Ala Met Ala Thr Leu Met Asp Lys His Lys 
245 250 255 

Ser Val Thr Gln Gly Lys Val Val Gly Pro Ala Lys Tyr Gly Gln Gln 
260 265 270 

Thr Asp Ser Ala Ile Leu Tyr Ile Asn Gly Asp Leu Ala Lys Ala Val 
275 280 285 

Lys Leu Gly Glu Lys Leu Lys Lys Leu Ser Gly Ile Pro Pro Glu Gly 
290 295 300 

Phe Val Glu His Thr Pro Leu Ser Met Gln Ser Thr Gly Leu Gly Leu 
305 310 315 320 

Ser Tyr Ala Glu Ser Val Glu Gly Gln Pro Ser Ser His Gly Gln Ala 
325 330 335 

Arg Thr His Val Ile Met Asp Ala Leu Lys Gly Gln Gly Pro Met Glu 
340 345 350 

Asn Arg Leu Lys Met Ala Leu Ala Glu Arg Gly Tyr Asp Pro Glu Asn 
355 360 365 

Pro Ala Leu Arg Ala Arg Asn 
370 375 

The remaining open reading frame, designated shcA, is a DNA 
molecule having a nucleotide sequence (SEQ. ID. No. 37) as follows: 



atggagatgc ccgccttggc gtttgacgat aagggtgcgt gcaacatgat catcgacaag 60 
gcattcgctc tgacgctgtt gcgcgacgac acgcatcaac gtttgttgct gattggtctg 120 
cttgagccac acgaggatct acccttgcag cgcctgttgg ctggcgctct caaccccctt 180 
gtgaatgccg gccccggcat tggctgggat gagcaaagcg gcctgtacca cgcttaccaa 240 
agcatcccgc gggaaaaagt cagcgtggag atgctgaagc tcgaaattgc aggattggtc 300 
gaatggatga agtgttggcg agaagcccgc acgtga 336 



The encoded protein or polypeptide, ShcA, has an amino acid sequence (SEQ. ID. 
No. 38) as follows: 

Met Glu Met Pro Ala Leu Ala Phe Asp Asp Lys Gly Ala Cys Asn Met 
1 5 10 15 

Ile Ile Asp Lys Ala Phe Ala Leu Thr Leu Leu Arg Asp Asp Thr His 
20 25 30 

Gln Arg Leu Leu Leu Ile Gly Leu Leu Glu Pro His Glu Asp Leu Pro 
35 40 45 

Leu Gln Arg Leu Leu Ala Gly Ala Leu Asn Pro Leu Val Asn Ala Gly 
50 55 60 

Pro Gly Ile Gly Trp Asp Glu Gln Ser Gly Leu Tyr His Ala Tyr Gln 
65 70 75 80 

Ser Ile Pro Arg Glu Lys Val Ser Val Glu Met Leu Lys Leu Glu Ile 
85 90 95 

Ala Gly Leu Val Glu Trp Met Lys Cys Trp Arg Glu Ala Arg Thr 
100 105 110 
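
The correspondence between an open reading frame and its encoded polypeptide can be checked by translating the nucleotide sequence codon by codon. The following is a minimal sketch in Python, assuming the standard genetic code; the function name `translate` and the lookup tables are illustrative and are not part of the disclosure. It translates the first 30 and the last 6 nucleotides of shcA (SEQ. ID. No. 37).

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order (assumption:
# the standard table applies; "*" marks a stop codon).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue codes, as used in the sequence listings.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna):
    """Translate a DNA string in frame 1, stopping at the first stop codon.

    Spaces and position numbers copied from a listing are ignored.
    """
    seq = "".join(ch for ch in dna.lower() if ch in "acgt")
    residues = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        residues.append(THREE_LETTER[aa])
    return " ".join(residues)


# First 30 nucleotides of shcA, copied with the listing's spacing.
print(translate("atggagatgc ccgccttggc gtttgacgat"))
# Met Glu Met Pro Ala Leu Ala Phe Asp Asp

# Last 6 nucleotides of shcA: one codon followed by the stop codon.
print(translate("acgtga 336"))
# Thr
```

Because the sketch translates every codon with the same table, a sequence whose first codon is gtg is reported as beginning with Val, which is also how the HopPsyA listing (SEQ. ID. No. 36) is written.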



In addition to the above DNA molecules and proteins or polypeptides, 
the present invention also relates to homologs of various DNA molecules of the 
present invention which have been isolated from other Pseudomonas syringae 
pathovars. For example, a number of AvrPphE, AvrPphF, and HopPsyA homologs 
have been identified from Pseudomonas syringae pathovars. 

The DNA molecule from Pseudomonas syringae pv. angulata which 
encodes an AvrPphE homolog has a nucleotide sequence (SEQ. ID. No. 39) as 
follows: 






atgagaattc acagtgctgg tcacagcctg cctgcgccag gccctagcgt ggaaaccact 60 
gaaaaggctg ttcaatcatc atcggcccag aaccccgctt cttacagttc acaaacagaa 120 
cgtcctgaag ccggttcgac tcaagtgcga ctgaactacc cttactcatc agtcaagaca 180 
cgcttgccac ccgtttcttc tacagggcag gccatttctg ccacgccatc ttcattgccc 240 
ggttacctgc tgttacgtcg gctcgaccga cgtccactgg atgaagacag tatcaaggct 300 
ctggttccgg cagacgaagc ggtgcgtgaa gcacgccgcg cgttgccctt cggcaggggc 360 
aacattgatg tggatgcaca acgtacccac ctgcaaagcg gcgctcgcgc agtcgctgca 420 
aagcgcttga gaaaagatgc cgagcgcgct ggccatgagc cgatgcccgg gaatgatgag 480 
atgaactggc atgttcttgt cgccatgtca gggcaggtgt ttggcgctgg caactgtggc 540 
gaacatgctc gtatagcaag cttcgcttac ggggccctgg ctcaggaaag cgggcgtagt 600 
ccccgcgaaa agattcattt ggccgagcag cccggaaaag atcacgtctg ggctgaaacg 660 
gataattcca gcgctggctc ttcgcccatc gtcatggacc cgtggtctaa cggcgcagcc 720 
attttggcgg aggacagccg gtttgccaaa gatcgcagta cggtagagcg aacatattca 780 
ttcacccttg caatggcagc tgaagccggc aaggttacgc gtgaaaccgc cgagaacgtt 840 
ctgacccaca cgacaagccg tctgcagaaa cgtcttgctg atcagttgcc gaacgtctca 900 
ccgcttgaag gaggccgcta tcagcaggaa aagtcggtgc ttgatgaggc gttcgcccga 960 
cgagtgagcg acaagttgaa tagtgacgat ccacggcgtg cgttgcagat ggaaattgaa 1020 
gctgttggtg ttgcaatgtc gctgggtgcc gaaggcgtca agacggtcgc ccgacaggcg 1080 
ccaaaggtgg tcaggcaagc cagaagcgtc gcgtcgtcta aaggcatgcc tccacgaaga 1140 
taa 1143 

The amino acid sequence (SEQ. ID. No. 40) for the AvrPphE homolog of 
Pseudomonas syringae pv. angulata is as follows: 



Met Arg Ile His Ser Ala Gly His Ser Leu Pro Ala Pro Gly Pro Ser 
1 5 10 15 

Val Glu Thr Thr Glu Lys Ala Val Gln Ser Ser Ser Ala Gln Asn Pro 
20 25 30 

Ala Ser Tyr Ser Ser Gln Thr Glu Arg Pro Glu Ala Gly Ser Thr Gln 
35 40 45 

Val Arg Leu Asn Tyr Pro Tyr Ser Ser Val Lys Thr Arg Leu Pro Pro 
50 55 60 

Val Ser Ser Thr Gly Gln Ala Ile Ser Ala Thr Pro Ser Ser Leu Pro 
65 70 75 80 

Gly Tyr Leu Leu Leu Arg Arg Leu Asp Arg Arg Pro Leu Asp Glu Asp 
85 90 95 

Ser Ile Lys Ala Leu Val Pro Ala Asp Glu Ala Val Arg Glu Ala Arg 
100 105 110 

Arg Ala Leu Pro Phe Gly Arg Gly Asn Ile Asp Val Asp Ala Gln Arg 
115 120 125 

Thr His Leu Gln Ser Gly Ala Arg Ala Val Ala Ala Lys Arg Leu Arg 
130 135 140 

Lys Asp Ala Glu Arg Ala Gly His Glu Pro Met Pro Gly Asn Asp Glu 
145 150 155 160 

Met Asn Trp His Val Leu Val Ala Met Ser Gly Gln Val Phe Gly Ala 
165 170 175 

Gly Asn Cys Gly Glu His Ala Arg Ile Ala Ser Phe Ala Tyr Gly Ala 
180 185 190 

Leu Ala Gln Glu Ser Gly Arg Ser Pro Arg Glu Lys Ile His Leu Ala 
195 200 205 

Glu Gln Pro Gly Lys Asp His Val Trp Ala Glu Thr Asp Asn Ser Ser 
210 215 220 

Ala Gly Ser Ser Pro Ile Val Met Asp Pro Trp Ser Asn Gly Ala Ala 
225 230 235 240 

Ile Leu Ala Glu Asp Ser Arg Phe Ala Lys Asp Arg Ser Thr Val Glu 
245 250 255 

Arg Thr Tyr Ser Phe Thr Leu Ala Met Ala Ala Glu Ala Gly Lys Val 
260 265 270 

Thr Arg Glu Thr Ala Glu Asn Val Leu Thr His Thr Thr Ser Arg Leu 
275 280 285 

Gln Lys Arg Leu Ala Asp Gln Leu Pro Asn Val Ser Pro Leu Glu Gly 
290 295 300 

Gly Arg Tyr Gln Gln Glu Lys Ser Val Leu Asp Glu Ala Phe Ala Arg 
305 310 315 320 

Arg Val Ser Asp Lys Leu Asn Ser Asp Asp Pro Arg Arg Ala Leu Gln 
325 330 335 

Met Glu Ile Glu Ala Val Gly Val Ala Met Ser Leu Gly Ala Glu Gly 
340 345 350 

Val Lys Thr Val Ala Arg Gln Ala Pro Lys Val Val Arg Gln Ala Arg 
355 360 365 

Ser Val Ala Ser Ser Lys Gly Met Pro Pro Arg Arg 
370 375 380 



The encoding DNA molecule has a GC content of about 57 percent; the protein or 
polypeptide has an estimated isoelectric point of about 9.5 and an estimated 
molecular weight of about 41 kDa. 
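
GC content is a property of the nucleotide sequence: the fraction of bases that are g or c. The following is a minimal sketch in Python of how such a figure might be computed; the function name `gc_content` is illustrative and not part of the disclosure, and the short input is only the first ten nucleotides of SEQ. ID. No. 39, so the result is not the full-length value stated above.

```python
def gc_content(dna):
    """Return the fraction of g and c bases in a DNA string.

    Spaces and position numbers copied from a listing are ignored.
    """
    bases = [ch for ch in dna.lower() if ch in "acgt"]
    if not bases:
        raise ValueError("no nucleotides found")
    gc = sum(1 for ch in bases if ch in "gc")
    return gc / len(bases)


# First ten nucleotides of SEQ. ID. No. 39: three of ten bases are g or c.
print(gc_content("atgagaattc"))
# 0.3
```

Applying the same function to a complete open reading frame, with all rows of the listing concatenated, would give the whole-sequence percentage.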

The DNA molecule from Pseudomonas syringae pv. glycinea which 
encodes an AvrPphE homolog has a nucleotide sequence (SEQ. ID. No. 41) as 
follows: 



atgagaattc acagtgctgg tcacagcctg cccgcgccag gccctagcgt ggaaaccact 60 
gaaaaggctg ttcaatcatc atcggcccag aaccccgctt cttgcagttc acaaacagaa 120 
cgtcctgaag ccggttcgac tcaagtgcga ccgaactacc cttactcatc agtcaagaca 180 
cgcttgccac ccgtttcttc cacagggcag gccatttctg acacgccatc ttcattgtcc 240 
ggttacctgc tgttacgtcg gctcgaccga cgtccactgg atgaagacag tatcaaggct 300 
ctggttccgg cagacgaagc gttgcgtgaa gcacgccgcg cgttgccctt cggcaggggc 360 
aacattgatg tggatgcaca acgtacccac ctgcaaagcg gcgctcgcgc agtcgctgca 420 
aagcgcttga gaaaagatgc cgagcgcgct ggccatgagc cgatgcccga gaatgatgag 480 
atgaactggc atgttcttgt cgccatgtca gggcaggtgt ttggcgctgg caactgtggc 540 
gaacatgctc gtatagcaag cttcgcttac ggggccctgg ctcaggaaag cgggcgtagt 600 
ccccgcgaaa agattcattt ggccgagcag cccggaaaag atcacgtctg ggctgaaacg 660 
gataattcca gcgctggctc ttcgcccatc gtcatggacc cgtggtctaa cggcgtagcc 720 
attttggcgg aggacagccg gtttgccaaa gatcgcagtg cggtagagcg aacatattca 780 
ttcacccttg caatggcagc tgaagccggc aaggttgcgc gtgaaaccgc cgagaacgtt 840 
ctgacccaca cgacaagccg tctgcagaaa cgtcttgctg atcagttgcc gaacgtctca 900 
ccgcttgaag gaggccgcta tcagccggaa aagtcggtgc ttgatgaggc gttcgcccga 960 
cgagtgagcg acaagttgaa tagtgacgat ccacggcgtg cgttgcagat ggaaattgaa 1020 
gctgttggtg ttgcaatgtc gctgggtgcc gaaggcgtca agacggtcgc ccgacaggcg 1080 
ccaaaggtgg tcaggcaagc cagaagcgtc gcgtcgtcta aaggcatgcc tccacgaaga 1140 
taa 1143 



The amino acid sequence (SEQ. ID. No. 42) for the AvrPphE homolog of 
Pseudomonas syringae pv. glycinea is as follows: 



Met Arg Ile His Ser Ala Gly His Ser Leu Pro Ala Pro Gly Pro Ser 
1 5 10 15 

Val Glu Thr Thr Glu Lys Ala Val Gln Ser Ser Ser Ala Gln Asn Pro 
20 25 30 

Ala Ser Cys Ser Ser Gln Thr Glu Arg Pro Glu Ala Gly Ser Thr Gln 
35 40 45 

Val Arg Pro Asn Tyr Pro Tyr Ser Ser Val Lys Thr Arg Leu Pro Pro 
50 55 60 

Val Ser Ser Thr Gly Gln Ala Ile Ser Asp Thr Pro Ser Ser Leu Ser 
65 70 75 80 

Gly Tyr Leu Leu Leu Arg Arg Leu Asp Arg Arg Pro Leu Asp Glu Asp 
85 90 95 

Ser Ile Lys Ala Leu Val Pro Ala Asp Glu Ala Leu Arg Glu Ala Arg 
100 105 110 

Arg Ala Leu Pro Phe Gly Arg Gly Asn Ile Asp Val Asp Ala Gln Arg 
115 120 125 

Thr His Leu Gln Ser Gly Ala Arg Ala Val Ala Ala Lys Arg Leu Arg 
130 135 140 

Lys Asp Ala Glu Arg Ala Gly His Glu Pro Met Pro Glu Asn Asp Glu 
145 150 155 160 

Met Asn Trp His Val Leu Val Ala Met Ser Gly Gln Val Phe Gly Ala 
165 170 175 

Gly Asn Cys Gly Glu His Ala Arg Ile Ala Ser Phe Ala Tyr Gly Ala 
180 185 190 

Leu Ala Gln Glu Ser Gly Arg Ser Pro Arg Glu Lys Ile His Leu Ala 
195 200 205 

Glu Gln Pro Gly Lys Asp His Val Trp Ala Glu Thr Asp Asn Ser Ser 
210 215 220 

Ala Gly Ser Ser Pro Ile Val Met Asp Pro Trp Ser Asn Gly Val Ala 
225 230 235 240 

Ile Leu Ala Glu Asp Ser Arg Phe Ala Lys Asp Arg Ser Ala Val Glu 
245 250 255 

Arg Thr Tyr Ser Phe Thr Leu Ala Met Ala Ala Glu Ala Gly Lys Val 
260 265 270 

Ala Arg Glu Thr Ala Glu Asn Val Leu Thr His Thr Thr Ser Arg Leu 
275 280 285 

Gln Lys Arg Leu Ala Asp Gln Leu Pro Asn Val Ser Pro Leu Glu Gly 
290 295 300 

Gly Arg Tyr Gln Pro Glu Lys Ser Val Leu Asp Glu Ala Phe Ala Arg 
305 310 315 320 

Arg Val Ser Asp Lys Leu Asn Ser Asp Asp Pro Arg Arg Ala Leu Gln 
325 330 335 

Met Glu Ile Glu Ala Val Gly Val Ala Met Ser Leu Gly Ala Glu Gly 
340 345 350 

Val Lys Thr Val Ala Arg Gln Ala Pro Lys Val Val Arg Gln Ala Arg 
355 360 365 

Ser Val Ala Ser Ser Lys Gly Met Pro Pro Arg Arg 
370 375 380 



The encoding DNA molecule has a GC content of about 57 percent; the protein or 
polypeptide has an estimated isoelectric point of about 9.1 and an estimated 
molecular weight of about 41 kDa. 

The DNA molecule from Pseudomonas syringae pv. tabaci which 
encodes an AvrPphE homolog has a nucleotide sequence (SEQ. ID. No. 43) as 
follows: 



atgagaattc acagtgctgg tcacagcctg cctgcgccag gccctagcgt ggaaaccact 60 
gaaaaggctg ttcaatcatc atcggcccag aaccccgctt cttgcagttc acaaacagaa 120 
cgtcctgaag ccggttcgac tcaagtgcga ccgaactacc cttactcatc agtcaagaca 180 
cgcttgccac ccgtttcttc tacagggcag gccatttctg acacgccatc ttcattgccc 240 
ggttacctgc tgttacgtcg gctcgaccga cgtccactgg atgaagacag tatcaaggct 300 
ctggttccgg cagacgaagc ggtgcgtgaa gcacgccgcg cgttgccctt cggcaggggc 360 
aacattgatg tggatgcaca acgtacccac ctgcaaagcg gcgctcgcgc agtcgctgca 420 
aagcgcttga gaaaagatgc cgagcgcgct ggccatgagc cgatgcccgg gaatgatgag 480 
atgaactggc atgttcttgt cgccatgtca gggcaggtgt ttggcgctgg caactgtggc 540 
gaacatgctc gtatagcaag cttcgcttac ggggccctgg ctcaggaaag cgggcgtagt 600 
ccccgcgaaa agattcattt ggccgagcag cccggaaaag atcacgtctg ggctgaaacg 660 
gataattcca gcgctggctc ttcgcccatc gtcatggacc cgtggtctaa cggcgcagcc 720 
attttggcgg aggacagccg gtttgccaaa gatcgcagtg cggtagagcg aacatattca 780 
ttcacccttg caatggcagc tgaagccggc aaggttacgc gtgaaactgc cgagaacgtt 840 
ctgacccaca cgacaagccg tctgcagaaa cgtcttgctg atcagttgcc gaacgtctca 900 
ccgcttgaag gaggccgcta tcagcaggaa aagtcggtgc ttgatgaggc gttcgcccga 960 
cgagtgagcg acaagttgaa tagtgacgat ccacggcgtg cgttgcagat ggaaattgaa 1020 
gctgttggtg ttgcaatgtc gctgggtgcc gaaggcgtca agacggtcgc ccgacaggcg 1080 
ccaaaggtgg tcaggcaagc cagaagcgtc gcgtcgtcta aaggcatgcc tccacgaaga 1140 
taa 1143 



The amino acid sequence (SEQ. ID. No. 44) for the AvrPphE homolog of 
Pseudomonas syringae pv. tabaci is as follows: 



Met Arg Ile His Ser Ala Gly His Ser Leu Pro Ala Pro Gly Pro Ser 
1 5 10 15 

Val Glu Thr Thr Glu Lys Ala Val Gln Ser Ser Ser Ala Gln Asn Pro 
20 25 30 

Ala Ser Cys Ser Ser Gln Thr Glu Arg Pro Glu Ala Gly Ser Thr Gln 
35 40 45 

Val Arg Pro Asn Tyr Pro Tyr Ser Ser Val Lys Thr Arg Leu Pro Pro 
50 55 60 

Val Ser Ser Thr Gly Gln Ala Ile Ser Asp Thr Pro Ser Ser Leu Pro 
65 70 75 80 

Gly Tyr Leu Leu Leu Arg Arg Leu Asp Arg Arg Pro Leu Asp Glu Asp 
85 90 95 

Ser Ile Lys Ala Leu Val Pro Ala Asp Glu Ala Val Arg Glu Ala Arg 
100 105 110 

Arg Ala Leu Pro Phe Gly Arg Gly Asn Ile Asp Val Asp Ala Gln Arg 
115 120 125 

Thr His Leu Gln Ser Gly Ala Arg Ala Val Ala Ala Lys Arg Leu Arg 
130 135 140 

Lys Asp Ala Glu Arg Ala Gly His Glu Pro Met Pro Gly Asn Asp Glu 
145 150 155 160 

Met Asn Trp His Val Leu Val Ala Met Ser Gly Gln Val Phe Gly Ala 
165 170 175 

Gly Asn Cys Gly Glu His Ala Arg Ile Ala Ser Phe Ala Tyr Gly Ala 
180 185 190 

Leu Ala Gln Glu Ser Gly Arg Ser Pro Arg Glu Lys Ile His Leu Ala 
195 200 205 

Glu Gln Pro Gly Lys Asp His Val Trp Ala Glu Thr Asp Asn Ser Ser 
210 215 220 

Ala Gly Ser Ser Pro Ile Val Met Asp Pro Trp Ser Asn Gly Ala Ala 
225 230 235 240 

Ile Leu Ala Glu Asp Ser Arg Phe Ala Lys Asp Arg Ser Ala Val Glu 
245 250 255 

Arg Thr Tyr Ser Phe Thr Leu Ala Met Ala Ala Glu Ala Gly Lys Val 
260 265 270 

Thr Arg Glu Thr Ala Glu Asn Val Leu Thr His Thr Thr Ser Arg Leu 
275 280 285 

Gln Lys Arg Leu Ala Asp Gln Leu Pro Asn Val Ser Pro Leu Glu Gly 
290 295 300 

Gly Arg Tyr Gln Gln Glu Lys Ser Val Leu Asp Glu Ala Phe Ala Arg 
305 310 315 320 

Arg Val Ser Asp Lys Leu Asn Ser Asp Asp Pro Arg Arg Ala Leu Gln 
325 330 335 

Met Glu Ile Glu Ala Val Gly Val Ala Met Ser Leu Gly Ala Glu Gly 
340 345 350 

Val Lys Thr Val Ala Arg Gln Ala Pro Lys Val Val Arg Gln Ala Arg 
355 360 365 

Ser Val Ala Ser Ser Lys Gly Met Pro Pro Arg Arg 
370 375 380 



The encoding DNA molecule has a GC content of about 57 percent; the protein or 
polypeptide has an estimated isoelectric point of about 9.3 and an estimated 
molecular weight of about 41 kDa. 
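
Because the AvrPphE homologs listed here have the same length, they can be compared position by position. The following is a minimal sketch in Python, assuming two equal-length sequences written with three-letter residue codes in standard form; the function name `residue_differences` is illustrative and not part of the disclosure. The example compares residues 33 to 48 of the pv. angulata homolog (SEQ. ID. No. 40) with the same stretch of the pv. glycinea homolog (SEQ. ID. No. 42).

```python
def residue_differences(seq_a, seq_b, start=1):
    """List (position, residue_a, residue_b) wherever two sequences differ.

    Both inputs are space-separated three-letter codes of equal length;
    `start` is the position number of the first residue supplied.
    """
    a, b = seq_a.split(), seq_b.split()
    if len(a) != len(b):
        raise ValueError("sequences must have the same number of residues")
    return [(start + i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Residues 33-48 as given in the two listings.
angulata_33_48 = "Ala Ser Tyr Ser Ser Gln Thr Glu Arg Pro Glu Ala Gly Ser Thr Gln"
glycinea_33_48 = "Ala Ser Cys Ser Ser Gln Thr Glu Arg Pro Glu Ala Gly Ser Thr Gln"

print(residue_differences(angulata_33_48, glycinea_33_48, start=33))
# [(35, 'Tyr', 'Cys')]
```

This simple comparison does not align sequences, so it is only meaningful for homologs of identical length such as the 380-residue sequences above.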

Another DNA molecule from Pseudomonas syringae pv. tabaci which 
encodes an AvrPphE homolog has a nucleotide sequence (SEQ. ID. No. 45) as follows: 



atgagaattc acagtgctgg tcacagcctg cctgcgccag gccctagcgt ggaaaccact 60 
gaaaaggctg ttcaatcatc atcggcccag aaccccgctt cttgcagttc acaaacagaa 120 
cgtcctgaag ccggttcgac tcaagtgcga ccgaactacc cttactcatc agtcaagaca 180 
cgcttgccac ccgtttcttc tacagggcag gccatttctg acacgccatc ttcattgccc 240 
ggttacctgc tgttacgtcg gctcgaccga cgtccactgg atgaagacag tatcaaggct 300 
ctggttccgg cagacgaagc ggtgcgtgaa gcacgccgcg cgttgccctt cggcaggggc 360 
aacattgatg tggatgcaca acgtacccac ctgcaaagcg gcgctcgcgc agtcgctgca 420 
aagcgcttga gaaaagatgc cgagcgcgct ggccatgagc cgatgcccgg gaatgatgag 480 
atgaactggc atgttcttgt cgccatgtca gggcaggtgt ttggcgctgg caactgtggc 540 
gaacatgctc gtatagcaag cttcgcttac ggggccctgg ctcaggaaag cgggcgtagt 600 
ccccgcgaaa agattcattt ggccgagcag cccggaaaag atcacgtctg ggctgaaacg 660 
gataattcca gcgctggctc ttcgcccatc gtcatggacc cgtggtctaa cggcgcagcc 720 
attttggcgg aggacagccg gtttgccaaa gatcgcagtg cggtagagcg aacatattca 780 
ttcacccttg caatggcagc tgaagccggc aaggttacgc gtgaaactgc cgagaacgtt 840 
ctgacccaca cgacaagccg tctgcagaaa cgtcttgctg atcagttgcc gaacgtctca 900 
ccgcttgaag gaggccgcta tcagcaggaa aagtcggtgc ttgatgaggc gttcgcccga 960 
cgagtgagcg acaagttgaa tagtgacgat ccacggcgtg cgttgcagat ggaaattgaa 1020 
gctgttggtg ttgcaatgtc gctgggtgcc gaaggcgtca agacggtcgc ccgacaggcg 1080 
ccaaaggtgg tcaggcaagc cagaagcgtc gcgtcgtcta aaggcatgcc tccacgaaga 1140 
taa 1143 



The encoded AvrPphE homolog has an amino acid sequence according to SEQ. ID. 
No. 46 as follows: 

Met Arg Ile His Ser Ala Gly His Ser Leu Pro Ala Pro Gly Pro Ser 
1 5 10 15 

Val Glu Thr Thr Glu Lys Ala Val Gln Ser Ser Ser Ala Gln Asn Pro 
20 25 30 

Ala Ser Cys Ser Ser Gln Thr Glu Arg Pro Glu Ala Gly Ser Thr Gln 
35 40 45 

Val Arg Pro Asn Tyr Pro Tyr Ser Ser Val Lys Thr Arg Leu Pro Pro 
50 55 60 

Val Ser Ser Thr Gly Gln Ala Ile Ser Asp Thr Pro Ser Ser Leu Pro 
65 70 75 80 

Gly Tyr Leu Leu Leu Arg Arg Leu Asp Arg Arg Pro Leu Asp Glu Asp 
85 90 95 

Ser Ile Lys Ala Leu Val Pro Ala Asp Glu Ala Val Arg Glu Ala Arg 
100 105 110 

Arg Ala Leu Pro Phe Gly Arg Gly Asn Ile Asp Val Asp Ala Gln Arg 
115 120 125 

Thr His Leu Gln Ser Gly Ala Arg Ala Val Ala Ala Lys Arg Leu Arg 
130 135 140 

Lys Asp Ala Glu Arg Ala Gly His Glu Pro Met Pro Gly Asn Asp Glu 
145 150 155 160 

Met Asn Trp His Val Leu Val Ala Met Ser Gly Gln Val Phe Gly Ala 
165 170 175 

Gly Asn Cys Gly Glu His Ala Arg Ile Ala Ser Phe Ala Tyr Gly Ala 
180 185 190 

Leu Ala Gln Glu Ser Gly Arg Ser Pro Arg Glu Lys Ile His Leu Ala 
195 200 205 

Glu Gln Pro Gly Lys Asp His Val Trp Ala Glu Thr Asp Asn Ser Ser 
210 215 220 

Ala Gly Ser Ser Pro Ile Val Met Asp Pro Trp Ser Asn Gly Ala Ala 
225 230 235 240 

Ile Leu Ala Glu Asp Ser Arg Phe Ala Lys Asp Arg Ser Ala Val Glu 
245 250 255 

Arg Thr Tyr Ser Phe Thr Leu Ala Met Ala Ala Glu Ala Gly Lys Val 
260 265 270 

Thr Arg Glu Thr Ala Glu Asn Val Leu Thr His Thr Thr Ser Arg Leu 
275 280 285 

Gln Lys Arg Leu Ala Asp Gln Leu Pro Asn Val Ser Pro Leu Glu Gly 
290 295 300 

Gly Arg Tyr Gln Gln Glu Lys Ser Val Leu Asp Glu Ala Phe Ala Arg 
305 310 315 320 

Arg Val Ser Asp Lys Leu Asn Ser Asp Asp Pro Arg Arg Ala Leu Gln 
325 330 335 

Met Glu Ile Glu Ala Val Gly Val Ala Met Ser Leu Gly Ala Glu Gly 
340 345 350 

Val Lys Thr Val Ala Arg Gln Ala Pro Lys Val Val Arg Gln Ala Arg 
355 360 365 

Ser Val Ala Ser Ser Lys Gly Met Pro Pro Arg Arg 
370 375 380 



A DNA molecule from Pseudomonas syringae pv. glycinea race 4 
which encodes an AvrPphE homolog has a nucleotide sequence (SEQ. ID. No. 47) 
as follows: 



atgagaattc acagtgctgg tcacagcctg cccgcgccag gccctagcgt ggaaaccact 60 
gaaaaggctg ttcaatcatc atcggcccag aaccccgctt cttgcagttc acaaacagaa 120 
cgtcctgaag ccggttcgac tcaagtgcga ccgaactacc cttactcatc agtcaagaca 180 
cgcttgccac ccgtttcttc cacagggcag gccatttctg acacgccatc ttcattgtcc 240 
ggttacctgc tgttacgtcg gctcgaccga cgtccactgg atgaagacag tatcaaggct 300 
ctggttccgg cagacgaagc gttgcgtgaa gcacgccgcg cgttgccctt cggcaggggc 360 
aacattgatg tggatgcaca acgtacccac ctgcaaagcg gcgctcgcgc agtcgctgca 420 
aagcgcttga gaaaagatgc cgagcgcgct ggccatgagc cgatgcccga gaatgatgag 480 
atgaactggc atgttcttgt cgccatgtca gggcaggtgt ttggcgctgg caactgtggc 540 
gaacatgctc gtatagcaag cttcgcttac ggggccctgg ctcaggaaag cgggcgtagt 600 
ccccgcgaaa agattcattt ggccgagcag cccggaaaag atcacgtctg ggctgaaacg 660 
gataattcca gcgctggctc ttcgcccatc gtcatggacc cgtggtctaa cggcgtagcc 720 
attttggcgg aggacagccg gtttgccaaa gatcgcagtg cggtagagcg aacatattca 780 
ttcacccttg caatggcagc tgaagccggc aaggttgcgc gtgaaaccgc cgagaacgtt 840 
ctgacccaca cgacaagccg tctgcagaaa cgtcttgctg atcagttgcc gaacgtctca 900 
ccgcttgaag gaggccgcta tcagccggaa aagtcggtgc ttgatgaggc gttcgcccga 960 
cgagtgagcg acaagttgaa tagtgacgat ccacggcgtg cgttgcagat ggaaattgaa 1020 
gctgttggtg ttgcaatgtc gctgggtgcc gaaggcgtca agacggtcgc ccgacaggcg 1080 
ccaaaggtgg tcaggcaagc cagaagcgtc gcgtcgtcta aaggcatgcc tccacgaaga 1140 
taa 1143 



The encoded AvrPphE homolog has an amino acid sequence according to SEQ. ID. 
No. 48 as follows: 

Met Arg Ile His Ser Ala Gly His Ser Leu Pro Ala Pro Gly Pro Ser 
1 5 10 15 

Val Glu Thr Thr Glu Lys Ala Val Gln Ser Ser Ser Ala Gln Asn Pro 
20 25 30 

Ala Ser Cys Ser Ser Gln Thr Glu Arg Pro Glu Ala Gly Ser Thr Gln 
35 40 45 

Val Arg Pro Asn Tyr Pro Tyr Ser Ser Val Lys Thr Arg Leu Pro Pro 
50 55 60 

Val Ser Ser Thr Gly Gln Ala Ile Ser Asp Thr Pro Ser Ser Leu Ser 
65 70 75 80 

Gly Tyr Leu Leu Leu Arg Arg Leu Asp Arg Arg Pro Leu Asp Glu Asp 
85 90 95 

Ser Ile Lys Ala Leu Val Pro Ala Asp Glu Ala Leu Arg Glu Ala Arg 
100 105 110 

Arg Ala Leu Pro Phe Gly Arg Gly Asn Ile Asp Val Asp Ala Gln Arg 
115 120 125 

Thr His Leu Gln Ser Gly Ala Arg Ala Val Ala Ala Lys Arg Leu Arg 
130 135 140 

Lys Asp Ala Glu Arg Ala Gly His Glu Pro Met Pro Glu Asn Asp Glu 
145 150 155 160 

Met Asn Trp His Val Leu Val Ala Met Ser Gly Gln Val Phe Gly Ala 
165 170 175 

Gly Asn Cys Gly Glu His Ala Arg Ile Ala Ser Phe Ala Tyr Gly Ala 
180 185 190 

Leu Ala Gln Glu Ser Gly Arg Ser Pro Arg Glu Lys Ile His Leu Ala 
195 200 205 

Glu Gln Pro Gly Lys Asp His Val Trp Ala Glu Thr Asp Asn Ser Ser 
210 215 220 

Ala Gly Ser Ser Pro Ile Val Met Asp Pro Trp Ser Asn Gly Val Ala 
225 230 235 240 

Ile Leu Ala Glu Asp Ser Arg Phe Ala Lys Asp Arg Ser Ala Val Glu 
245 250 255 

Arg Thr Tyr Ser Phe Thr Leu Ala Met Ala Ala Glu Ala Gly Lys Val 
260 265 270 

Ala Arg Glu Thr Ala Glu Asn Val Leu Thr His Thr Thr Ser Arg Leu 
275 280 285 

Gln Lys Arg Leu Ala Asp Gln Leu Pro Asn Val Ser Pro Leu Glu Gly 
290 295 300 

Gly Arg Tyr Gln Pro Glu Lys Ser Val Leu Asp Glu Ala Phe Ala Arg 
305 310 315 320 

Arg Val Ser Asp Lys Leu Asn Ser Asp Asp Pro Arg Arg Ala Leu Gln 
325 330 335 

Met Glu Ile Glu Ala Val Gly Val Ala Met Ser Leu Gly Ala Glu Gly 
340 345 350 

Val Lys Thr Val Ala Arg Gln Ala Pro Lys Val Val Arg Gln Ala Arg 
355 360 365 

Ser Val Ala Ser Ser Lys Gly Met Pro Pro Arg Arg 
370 375 380 

A DNA molecule from Pseudomonas syringae pv. phaseolicola 
strain B130 which encodes AvrPphE has a nucleotide sequence (SEQ. ID. No. 49) 
as follows: 






atgagaattc acagtgctgg tcacagcctg cccgcgccag gccctagcgt ggaaaccact 60 
gaaaaggctg ttcaatcatc atcggcccag aaccccgctt cttgcagttc acaaacagaa 120 
cgtcctgaag ccggttcgac tcaagtgcga ccgaactacc cttactcatc agtcaagaca 180 
cgcttgccac ccgtttcttc cacagggcag gccatttctg acacgccatc ttcattgccc 240 
ggttacctgc tgttacgtcg gctcgaccga cgtccactgg atgaagacag tatcaaggct 300 
ctggttccgg cagacgaagc gttgcgtgaa gcacgccgcg cgttgccctt cggcaggggc 360 
aacattgatg tggatgcaca acgtacccac ctgcaaagcg gcgctcgcgc agtcgctgca 420 
aagcgcttga gaaaagatgc cgagcgcgct ggccatgagc cgatgcccga gaatgatgag 480 
atgaactggc atgttcttgt cgccatgtca gggcaggtgt ttggcgctgg caactgtggc 540 
gaacatgctc gtatagcaag cttcgcttac ggggccctgg ctcaggaaag cgggcgtagt 600 
ccccgcgaaa agattcattt ggccgagcag cccggaaaag atcacgtctg ggctgaaacg 660 
gataattcca gcgctggctc ttcgcccatc gtcatggacc cgtggtctaa cggcgcagcc 720 
attttggcgg aggacagccg gtttgccaaa gatcgcagtg cggtagagcg aacatattca 780 
ttcacccttg caatggcagc tgaagccggc aaggttgcgc gtgaaaccgc cgagaacgtt 840 
ctgacccaca cgacaagccg tctgcagaag cgtcttgctg atcagttgcc gaacgtctca 900 
ccgcttgaag gaggccgcta tcagccggaa aagtcggtgc ttgatgaggc gttcgcccga 960 
cgagtgagcg acaagttgaa tagtgacgat ccacggcgtg cgttgcagat ggaaattgaa 1020 
gctgttggtg ttgcaatgtc gctgggtgcc gaaggcgtca agacggtcgc ccgacaggcg 1080 
ccaaaggtgg tcaggcaagc cagaagcgtc gcgtcgtcta aaggcatgcc tccacgaaga 1140 
taa 1143 



The encoded AvrPphE homolog has an amino acid sequence according to SEQ. ID. 
No. 50 as follows: 

Met Arg Ile His Ser Ala Gly His Ser Leu Pro Ala Pro Gly Pro Ser 
1 5 10 15 

Val Glu Thr Thr Glu Lys Ala Val Gln Ser Ser Ser Ala Gln Asn Pro 
20 25 30 

Ala Ser Cys Ser Ser Gln Thr Glu Arg Pro Glu Ala Gly Ser Thr Gln 
35 40 45 

Val Arg Pro Asn Tyr Pro Tyr Ser Ser Val Lys Thr Arg Leu Pro Pro 
50 55 60 

Val Ser Ser Thr Gly Gln Ala Ile Ser Asp Thr Pro Ser Ser Leu Pro 
65 70 75 80 

Gly Tyr Leu Leu Leu Arg Arg Leu Asp Arg Arg Pro Leu Asp Glu Asp 
85 90 95 

Ser Ile Lys Ala Leu Val Pro Ala Asp Glu Ala Leu Arg Glu Ala Arg 
100 105 110 

Arg Ala Leu Pro Phe Gly Arg Gly Asn Ile Asp Val Asp Ala Gln Arg 
115 120 125 

Thr His Leu Gln Ser Gly Ala Arg Ala Val Ala Ala Lys Arg Leu Arg 
130 135 140 

Lys Asp Ala Glu Arg Ala Gly His Glu Pro Met Pro Glu Asn Asp Glu 
145 150 155 160 

Met Asn Trp His Val Leu Val Ala Met Ser Gly Gln Val Phe Gly Ala 
165 170 175 

Gly Asn Cys Gly Glu His Ala Arg Ile Ala Ser Phe Ala Tyr Gly Ala 
180 185 190 

Leu Ala Gln Glu Ser Gly Arg Ser Pro Arg Glu Lys Ile His Leu Ala 
195 200 205 

Glu Gln Pro Gly Lys Asp His Val Trp Ala Glu Thr Asp Asn Ser Ser 
210 215 220 

Ala Gly Ser Ser Pro Ile Val Met Asp Pro Trp Ser Asn Gly Ala Ala 
225 230 235 240 

Ile Leu Ala Glu Asp Ser Arg Phe Ala Lys Asp Arg Ser Ala Val Glu 
245 250 255 

Arg Thr Tyr Ser Phe Thr Leu Ala Met Ala Ala Glu Ala Gly Lys Val 
260 265 270 

Ala Arg Glu Thr Ala Glu Asn Val Leu Thr His Thr Thr Ser Arg Leu 
275 280 285 

Gln Lys Arg Leu Ala Asp Gln Leu Pro Asn Val Ser Pro Leu Glu Gly 
290 295 300 

Gly Arg Tyr Gln Pro Glu Lys Ser Val Leu Asp Glu Ala Phe Ala Arg 
305 310 315 320 

Arg Val Ser Asp Lys Leu Asn Ser Asp Asp Pro Arg Arg Ala Leu Gln 
325 330 335 

Met Glu Ile Glu Ala Val Gly Val Ala Met Ser Leu Gly Ala Glu Gly 
340 345 350 

Val Lys Thr Val Ala Arg Gln Ala Pro Lys Val Val Arg Gln Ala Arg 
355 360 365 

Ser Val Ala Ser Ser Lys Gly Met Pro Pro Arg Arg 
370 375 380 



A DNA molecule from Pseudomonas syringae pv. angulata strain 
Pa9 which encodes an AvrPphE homolog has a nucleotide sequence (SEQ. ID. 
No. 51) as follows: 

atgagaattc acagtgctgg tcacagcctg cctgcgccag gccctagcgt ggaaaccact 60 
gaaaaggctg ttcaatcatc atcggcccag aaccccgctt cttacagttc acaaacagaa 120 
cgtcctgaag ccggttcgac tcaagtgcga ctgaactacc cttactcatc agtcaagaca 180 
cgcttgccac ccgtttcttc tacagggcag gccatttctg ccacgccatc ttcattgccc 240 
ggttacctgc tgttacgtcg gctcgaccga cgtccactgg atgaagacag tatcaaggct 300 
ctggttccgg cagacgaagc ggtgcgtgaa gcacgccgcg cgttgccctt cggcaggggc 360 
aacattgatg tggatgcaca acgtacccac ctgcaaagcg gcgctcgcgc agtcgctgca 420 
aagcgcttga gaaaagatgc cgagcgcgct ggccatgagc cgatgcccgg gaatgatgag 480 
atgaactggc atgttcttgt cgccatgtca gggcaggtgt ttggcgctgg caactgtggc 540 
gaacatgctc gtatagcaag cttcgcttac ggggccctgg ctcaggaaag cgggcgtagt 600 
ccccgcgaaa agattcattt ggccgagcag cccggaaaag atcacgtctg ggctgaaacg 660 
gataattcca gcgctggctc ttcgcccatc gtcatggacc cgtggtctaa cggcgcagcc 720 
attttggcgg aggacagccg gtttgccaaa gatcgcagta cggtagagcg aacatattca 780 
ttcacccttg caatggcagc tgaagccggc aaggttacgc gtgaaaccgc cgagaacgtt 840 
ctgacccaca cgacaagccg tctgcagaaa cgtcttgctg atcagttgcc gaacgtctca 900 
ccgcttgaag gaggccgcta tcagcaggaa aagtcggtgc ttgatgaggc gttcgcccga 960 
cgagtgagcg acaagttgaa tagtgacgat ccacggcgtg cgttgcagat ggaaattgaa 1020 
gctgttggtg ttgcaatgtc gctgggtgcc gaaggcgtca agacggtcgc ccgacaggcg 1080 
ccaaaggtgg tcaggcaagc cagaagcgtc gcgtcgtcta aaggcatgcc tccacgaaga 1140 
taa 1143 



The encoded AvrPphE homolog has an amino acid sequence according to SEQ. ID. 
No. 52 as follows: 

Met Arg Ile His Ser Ala Gly His Ser Leu Pro Ala Pro Gly Pro Ser 
1 5 10 15 

Val Glu Thr Thr Glu Lys Ala Val Gln Ser Ser Ser Ala Gln Asn Pro 
20 25 30 

Ala Ser Tyr Ser Ser Gln Thr Glu Arg Pro Glu Ala Gly Ser Thr Gln 
35 40 45 

Val Arg Leu Asn Tyr Pro Tyr Ser Ser Val Lys Thr Arg Leu Pro Pro 
50 55 60 

Val Ser Ser Thr Gly Gln Ala Ile Ser Ala Thr Pro Ser Ser Leu Pro 
65 70 75 80 

Gly Tyr Leu Leu Leu Arg Arg Leu Asp Arg Arg Pro Leu Asp Glu Asp 
85 90 95 

Ser Ile Lys Ala Leu Val Pro Ala Asp Glu Ala Val Arg Glu Ala Arg 
100 105 110 

Arg Ala Leu Pro Phe Gly Arg Gly Asn Ile Asp Val Asp Ala Gln Arg 
115 120 125 

Thr His Leu Gln Ser Gly Ala Arg Ala Val Ala Ala Lys Arg Leu Arg 
130 135 140 

Lys Asp Ala Glu Arg Ala Gly His Glu Pro Met Pro Gly Asn Asp Glu 
145 150 155 160 

Met Asn Trp His Val Leu Val Ala Met Ser Gly Gln Val Phe Gly Ala 
165 170 175 

Gly Asn Cys Gly Glu His Ala Arg Ile Ala Ser Phe Ala Tyr Gly Ala 
180 185 190 

Leu Ala Gln Glu Ser Gly Arg Ser Pro Arg Glu Lys Ile His Leu Ala 
195 200 205 

Glu Gln Pro Gly Lys Asp His Val Trp Ala Glu Thr Asp Asn Ser Ser 
210 215 220 

Ala Gly Ser Ser Pro Ile Val Met Asp Pro Trp Ser Asn Gly Ala Ala 
225 230 235 240 

Ile Leu Ala Glu Asp Ser Arg Phe Ala Lys Asp Arg Ser Thr Val Glu 
245 250 255 

Arg Thr Tyr Ser Phe Thr Leu Ala Met Ala Ala Glu Ala Gly Lys Val 
260 265 270 

Thr Arg Glu Thr Ala Glu Asn Val Leu Thr His Thr Thr Ser Arg Leu 
275 280 285 

Gln Lys Arg Leu Ala Asp Gln Leu Pro Asn Val Ser Pro Leu Glu Gly 
290 295 300 

Gly Arg Tyr Gln Gln Glu Lys Ser Val Leu Asp Glu Ala Phe Ala Arg 
305 310 315 320 

Arg Val Ser Asp Lys Leu Asn Ser Asp Asp Pro Arg Arg Ala Leu Gln 
325 330 335 

Met Glu Ile Glu Ala Val Gly Val Ala Met Ser Leu Gly Ala Glu Gly 
340 345 350 

Val Lys Thr Val Ala Arg Gln Ala Pro Lys Val Val Arg Gln Ala Arg 
355 360 365 

Ser Val Ala Ser Ser Lys Gly Met Pro Pro Arg Arg 
370 375 380 



A DNA molecule from Pseudomonas syringae pv. delphinii strain 
PDDCC529 which encodes an AvrPphE homolog has a nucleotide sequence (SEQ. 
ID. No. 53) as follows: 



atgaaaatac ataacgctgg cccaagcatt ccgatgcccg ctccatcgat tgagagcgct 60 
ggcaagactg cgcaatcatc attggctcaa ccgcagagcc aacgagccac ccccgtctcg 120 
ccatcagaga cttctgatgc ccgtccgtcc agtgtgcgta cgaactaccc ttattcatca 180 
gtcaaaacac ggttgcctcc cgttgcgtct gcagggcagc cactgtccgg gatgccgtct 240 
tcattacccg gctacttgct gttacgtcgg cttgaccatc gtccactgga tcaagacggt 300 
atcaaaggtt tgattccagc agatgaagcg gtgggtgaag cacgtcgcgc gttgcctttc 360 
ggcaggggca atatcgacgt ggatgcgcaa cgctccaact tggaaagcgg agcccgcaca 420 
ctcgcggcta ggcgtttgag aaaagatgcc gaggccgcgg gtcacgaacc aatgcctgca 480 
aatgaagata tgaactggca tgttcttgtt gcgatgtcag gacaggtttt tggcgcaggt 540 
aactgcgggg aacatgcccg catagcgagt ttcgcctacg gtgcactggc tcaggaaaaa 600 
gggcggaacg ccgatgagac tattcatttg gctgcgcaac gcggtaaaga ccacgtctgg 660 
gctgaaacgg acaattcaag cgctggatct tcaccggttg tcatggatcc gtggtcgaac 720 
ggtcctgcca tttttgcgga ggatagtcgg tttgccaaag atcgaagtac ggtagaacga 780 
acggattcct tcacgcttgc aactgctgct gaagcaggca agatcacgcg agagacggcc 840 
gagaatgctt tgacacaggc gaccagccgt ttgcagaaac gtcttgctga tcagaaaacg 900 
caagtctcgc cgcttgcagg agggcgctat cggcaagaaa attcggtgct tgatgacgcg 960 
ttcgcccgac gggcaagtgg caagttgagc aacaaggatc cgcggcatgc attacaggtg 1020 
gaaatcgagg cggccgcagt tgcaatgtcg ctgggcgccc aaggcgtaaa agcggttgcg 1080 
gaacaggccc ggacggtagt tgaacaagcc aggaaggtcg catctcccca aggcacgcct 1140 
cagcgagata cgtga 1155 



The encoded AvrPphE homolog has an amino acid sequence according to SEQ. ID. 
No. 54 as follows: 

Met Lys Ile His Asn Ala Gly Pro Ser Ile Pro Met Pro Ala Pro Ser
1 5 10 15

Ile Glu Ser Ala Gly Lys Thr Ala Gln Ser Ser Leu Ala Gln Pro Gln
20 25 30

Ser Gln Arg Ala Thr Pro Val Ser Pro Ser Glu Thr Ser Asp Ala Arg
35 40 45

Pro Ser Ser Val Arg Thr Asn Tyr Pro Tyr Ser Ser Val Lys Thr Arg
50 55 60

Leu Pro Pro Val Ala Ser Ala Gly Gln Pro Leu Ser Gly Met Pro Ser
65 70 75 80

Ser Leu Pro Gly Tyr Leu Leu Leu Arg Arg Leu Asp His Arg Pro Leu
85 90 95

Asp Gln Asp Gly Ile Lys Gly Leu Ile Pro Ala Asp Glu Ala Val Gly
100 105 110

Glu Ala Arg Arg Ala Leu Pro Phe Gly Arg Gly Asn Ile Asp Val Asp
115 120 125

Ala Gln Arg Ser Asn Leu Glu Ser Gly Ala Arg Thr Leu Ala Ala Arg
130 135 140

Arg Leu Arg Lys Asp Ala Glu Ala Ala Gly His Glu Pro Met Pro Ala
145 150 155 160

Asn Glu Asp Met Asn Trp His Val Leu Val Ala Met Ser Gly Gln Val
165 170 175

Phe Gly Ala Gly Asn Cys Gly Glu His Ala Arg Ile Ala Ser Phe Ala
180 185 190

Tyr Gly Ala Leu Ala Gln Glu Lys Gly Arg Asn Ala Asp Glu Thr Ile
195 200 205

His Leu Ala Ala Gln Arg Gly Lys Asp His Val Trp Ala Glu Thr Asp
210 215 220

Asn Ser Ser Ala Gly Ser Ser Pro Val Val Met Asp Pro Trp Ser Asn
225 230 235 240

Gly Pro Ala Ile Phe Ala Glu Asp Ser Arg Phe Ala Lys Asp Arg Ser
245 250 255

Thr Val Glu Arg Thr Asp Ser Phe Thr Leu Ala Thr Ala Ala Glu Ala
260 265 270

Gly Lys Ile Thr Arg Glu Thr Ala Glu Asn Ala Leu Thr Gln Ala Thr
275 280 285

Ser Arg Leu Gln Lys Arg Leu Ala Asp Gln Lys Thr Gln Val Ser Pro
290 295 300

Leu Ala Gly Gly Arg Tyr Arg Gln Glu Asn Ser Val Leu Asp Asp Ala
305 310 315 320

Phe Ala Arg Arg Ala Ser Gly Lys Leu Ser Asn Lys Asp Pro Arg His
325 330 335

Ala Leu Gln Val Glu Ile Glu Ala Ala Ala Val Ala Met Ser Leu Gly
340 345 350

Ala Gln Gly Val Lys Ala Val Ala Glu Gln Ala Arg Thr Val Val Glu
355 360 365

Gln Ala Arg Lys Val Ala Ser Pro Gln Gly Thr Pro Gln Arg Asp Thr
370 375 380

A DNA molecule from Pseudomonas syringae pv. delphinii strain
PDDCC529 which encodes a homolog of P. syringae pv. tomato DC3000 EEL
ORF2 has a nucleotide sequence (SEQ. ID. No. 55) as follows:



gtggttgagc gaaccggcac tgcatatcga aggcgtggag cagcctgctc gcgtatcacg 60
agccaaaatc aggtccgacg acgctttgga attacggtga atcagatgca aaagacgtcc 120
ctattggctt tggcctttgc aatcctggca gggtgtgggg gttcggggca ggcgccgggg 180
agtgatattc agggtgccca ggcagagatg aaaacaccca ttaaagtaga tctggatgcc 240
tacacctcaa aaaaacttga tgctgtgttg gaagctcggg ccaataaaag ctatgtgaat 300
aaaggtcaac tgatcgacct tgtgtcaggg gcgtttttgg gaacaccgta ccgctcaaac 360
atgttggtgg gcacagagga aatacctgaa cagttagtca tcgactttag aggtctggat 420
tgttttgctt atctggatta cgtagaggcg ttgcgaagat caacatcgca gcaggatttt 480
gtgaggaatc tcgttcaggt tcgttacaag ggtggtgatg ttgacttttt gaatcgcaag 540
cactttttca cggattgggc ttatggcact acacacccgg tggcggatga catcaccacg 600
cagataagcc ccggtgcggt aagtgtcaga aaacgcctta atgaaagggc caaaggcaaa 660
gtctatctgc caggtttgcc tgtggttgag cgcagcatga cctatatccc gagccgcctt 720
gtcgacagtc aggtggtaag ccacttgcgc acaggtgatt acatcggcat ttacaccccg 780
cttcccgggc tggatgtgac gcacgtcggt ttctttatca tgacggataa aggccctgtc 840
ttgcgaaatg catcttcacg aaaagaaaac agaaaggtaa tggatttgcc ttttctggac 900
tatgtatcgg aaaagccagg gattgttgtt ttcagggcaa aagacaattg a 951



The encoded protein or polypeptide has an amino acid sequence according to SEQ.
ID. No. 56 as follows:



Val Val Glu Arg Thr Gly Thr Ala Tyr Arg Arg Arg Gly Ala Ala Cys
1 5 10 15

Ser Arg Ile Thr Ser Gln Asn Gln Val Arg Arg Arg Phe Gly Ile Thr
20 25 30

Val Asn Gln Met Gln Lys Thr Ser Leu Leu Ala Leu Ala Phe Ala Ile
35 40 45

Leu Ala Gly Cys Gly Gly Ser Gly Gln Ala Pro Gly Ser Asp Ile Gln
50 55 60

Gly Ala Gln Ala Glu Met Lys Thr Pro Ile Lys Val Asp Leu Asp Ala
65 70 75 80

Tyr Thr Ser Lys Lys Leu Asp Ala Val Leu Glu Ala Arg Ala Asn Lys
85 90 95

Ser Tyr Val Asn Lys Gly Gln Leu Ile Asp Leu Val Ser Gly Ala Phe
100 105 110

Leu Gly Thr Pro Tyr Arg Ser Asn Met Leu Val Gly Thr Glu Glu Ile
115 120 125

Pro Glu Gln Leu Val Ile Asp Phe Arg Gly Leu Asp Cys Phe Ala Tyr
130 135 140

Leu Asp Tyr Val Glu Ala Leu Arg Arg Ser Thr Ser Gln Gln Asp Phe
145 150 155 160

Val Arg Asn Leu Val Gln Val Arg Tyr Lys Gly Gly Asp Val Asp Phe
165 170 175

Leu Asn Arg Lys His Phe Phe Thr Asp Trp Ala Tyr Gly Thr Thr His
180 185 190

Pro Val Ala Asp Asp Ile Thr Thr Gln Ile Ser Pro Gly Ala Val Ser
195 200 205

Val Arg Lys Arg Leu Asn Glu Arg Ala Lys Gly Lys Val Tyr Leu Pro
210 215 220

Gly Leu Pro Val Val Glu Arg Ser Met Thr Tyr Ile Pro Ser Arg Leu
225 230 235 240

Val Asp Ser Gln Val Val Ser His Leu Arg Thr Gly Asp Tyr Ile Gly
245 250 255

Ile Tyr Thr Pro Leu Pro Gly Leu Asp Val Thr His Val Gly Phe Phe
260 265 270

Ile Met Thr Asp Lys Gly Pro Val Leu Arg Asn Ala Ser Ser Arg Lys
275 280 285

Glu Asn Arg Lys Val Met Asp Leu Pro Phe Leu Asp Tyr Val Ser Glu
290 295 300

Lys Pro Gly Ile Val Val Phe Arg Ala Lys Asp Asn
305 310 315



A DNA molecule from Pseudomonas syringae pv. delphinii strain 
PDDCC529 ORF1 encodes a homolog of AvrPphF and has a nucleotide sequence 
(SEQ. ID. No. 57) as follows: 



atgaaaaact catttgatct tcttgtcgac ggtttggcga aagactacag catgccgaat 60
ttgccgaaca agaaacacga caatgaagtc tattgcttca cattccagag cgggctcgaa 120
gtaaacattt atcaggacga ctgtcgatgg gtgcatttct ccgccacaat cggacaattt 180
caagacgcca gcaatgacac gctcagccac gcacttcaac tgaacaattt cagtcttgga 240
aagcccttct tcacctttgg aatgaacgga gaaaaggtcg gcgtacttca cacacgcgtt 300
ccgttgattg aaatgaatac cgttgaaatg cgcaaggtat tcgaggactt gctcgatgta 360
gcaggcggca tcagagcgac attcaagctc agttaa 396



The encoded AvrPphF homolog has an amino acid sequence according to SEQ. ID.
No. 58 as follows:



Met Lys Asn Ser Phe Asp Leu Leu Val Asp Gly Leu Ala Lys Asp Tyr
1 5 10 15

Ser Met Pro Asn Leu Pro Asn Lys Lys His Asp Asn Glu Val Tyr Cys
20 25 30

Phe Thr Phe Gln Ser Gly Leu Glu Val Asn Ile Tyr Gln Asp Asp Cys
35 40 45

Arg Trp Val His Phe Ser Ala Thr Ile Gly Gln Phe Gln Asp Ala Ser
50 55 60

Asn Asp Thr Leu Ser His Ala Leu Gln Leu Asn Asn Phe Ser Leu Gly
65 70 75 80

Lys Pro Phe Phe Thr Phe Gly Met Asn Gly Glu Lys Val Gly Val Leu
85 90 95

His Thr Arg Val Pro Leu Ile Glu Met Asn Thr Val Glu Met Arg Lys
100 105 110

Val Phe Glu Asp Leu Leu Asp Val Ala Gly Gly Ile Arg Ala Thr Phe
115 120 125

Lys Leu Ser
130



A DNA molecule from Pseudomonas syringae pv. delphinii strain 
PDDCC529 ORF1 encodes a homolog of AvrPphF and has a nucleotide sequence 
(SEQ. ID. No. 59) as follows: 



atgagtacta tacctggcac ctcgggcgct cacccgattt atagctcaat ttccagccca 60
cgaaatatgt ctggctcgcc cacaccgagt caccgtattg gcggggaaac cctgacctct 120
attcatcagc tctctgccag ccagagagaa caatttctga atactcatga ccccatgaga 180
aaactcagga ttaacaatga tacgccactg tacagaacaa ccgagaagcg ttttatacag 240
gaaggcaaac tggccggcaa tccaaagtct attgcacgtg tcaacttgca cgaagaactg 300
cagcttaatc cgctcgccag tattttaggg aacttacctc acgaggcaag cgcttacttt 360
ccgaaaagcg cccgcgctgc ggatctgaaa gacccttcat tgaatgtaat gacaggctct 420
cgggcaaaaa atgctattcg cggctacgct catgacgacc atgtggcggt caagatgcga 480
ctgggcgact ttcttgaaaa aggcggcaag gtgtacgcgg acacttcatc agtcattgac 540
ggcggagacg aggcgagcgc gctgatcgtt acattgccta aaggacaaaa agttccagtc 600
gagattatcc ctacccataa cgacaacagc aataaaggca gaggctga 648



The encoded AvrPphF homolog has an amino acid sequence according to SEQ. ID.
No. 60 as follows:



Met Ser Thr Ile Pro Gly Thr Ser Gly Ala His Pro Ile Tyr Ser Ser
1 5 10 15

Ile Ser Ser Pro Arg Asn Met Ser Gly Ser Pro Thr Pro Ser His Arg
20 25 30

Ile Gly Gly Glu Thr Leu Thr Ser Ile His Gln Leu Ser Ala Ser Gln
35 40 45

Arg Glu Gln Phe Leu Asn Thr His Asp Pro Met Arg Lys Leu Arg Ile
50 55 60

Asn Asn Asp Thr Pro Leu Tyr Arg Thr Thr Glu Lys Arg Phe Ile Gln
65 70 75 80

Glu Gly Lys Leu Ala Gly Asn Pro Lys Ser Ile Ala Arg Val Asn Leu
85 90 95

His Glu Glu Leu Gln Leu Asn Pro Leu Ala Ser Ile Leu Gly Asn Leu
100 105 110

Pro His Glu Ala Ser Ala Tyr Phe Pro Lys Ser Ala Arg Ala Ala Asp
115 120 125

Leu Lys Asp Pro Ser Leu Asn Val Met Thr Gly Ser Arg Ala Lys Asn
130 135 140

Ala Ile Arg Gly Tyr Ala His Asp Asp His Val Ala Val Lys Met Arg
145 150 155 160

Leu Gly Asp Phe Leu Glu Lys Gly Gly Lys Val Tyr Ala Asp Thr Ser
165 170 175

Ser Val Ile Asp Gly Gly Asp Glu Ala Ser Ala Leu Ile Val Thr Leu
180 185 190

Pro Lys Gly Gln Lys Val Pro Val Glu Ile Ile Pro Thr His Asn Asp
195 200 205

Asn Ser Asn Lys Gly Arg Gly
210 215



A DNA molecule from Pseudomonas syringae pv. syringae strain
226 encodes a homolog of HopPsyA and has a nucleotide sequence (SEQ. ID.
No. 61) as follows:






gtgaacccta tccatgcacg cttctccagc gtagaagcgc tcagacattc aaacgttgat 60
attcaggcaa tcaaatccga gggtcagttg gaagtcaacg gcaagcgtta cgagattcgt 120
gcggccgctg acggctcaat cgcggtcctc agacccgatc aacagtccaa agcagacaag 180
ttcttcaaag gcgcagcgca tcttattggc ggacaaagcc agcgtgccca aatagcccag 240
gtactcaacg agaaagcggc ggcagttcca cgcctggaca gaatgttggg cagacgcttc 300
gatctggaga agggcggaag tagcgctgtg ggcgccgcaa tcaaggctgc cgacagccga 360
ctgacatcaa aacagacatt tgccagcttc cagcaatggg ctgaaaaagc tgaggcgctc 420
gggcgcgata ccgaaatcgg tatctacatg atctacaaga gggacacgcc agacacaacg 480
cctatgaatg cggcagagca agaacattac ctggaaacgc tacaggctct cgataacaag 540
aaaaacctta tcatacgccc gcagatccat gatgatcggg aagaggaaga gcttgatctg 600
ggccgataca tcgctgaaga cagaaatgcc agaaccggct tttttagaat ggttcctaaa 660
gaccaacgcg cacctgagac aaactcggga cgacttacca ttggtgtaga acctaaatat 720
ggagcgcagt tggccctcgc aatggcaacc ctgatggaca agcacaaatc tgtgacacaa 780
ggtaaagtcg tcggtccggc aaaatatggc cagcaaactg actctgccat tctttacata 840
aatggtgatc ttgcaaaagc agtaaaactg ggcgaaaagc tgaaaaagct gagcggtatc 900
cctcctgaag gattcgtcga acatacaccg ctaagcatgc agtcgacggg tctcggtctt 960
tcttatgccg agtcggttga agggcagcct tccagccacg gacaggcgag aacacacgtt 1020
atcatggatg ccttgaaagg ccagggcccc atggagaaca gactcaaaat ggcgctggca 1080
gaaagaggct atgacccgga aaatccggcg ctcagggcgc gaaactga 1128



The encoded HopPsyA homolog has an amino acid sequence according to SEQ. ID.
No. 62 as follows:



Val Asn Pro Ile His Ala Arg Phe Ser Ser Val Glu Ala Leu Arg His
1 5 10 15

Ser Asn Val Asp Ile Gln Ala Ile Lys Ser Glu Gly Gln Leu Glu Val
20 25 30

Asn Gly Lys Arg Tyr Glu Ile Arg Ala Ala Ala Asp Gly Ser Ile Ala
35 40 45

Val Leu Arg Pro Asp Gln Gln Ser Lys Ala Asp Lys Phe Phe Lys Gly
50 55 60

Ala Ala His Leu Ile Gly Gly Gln Ser Gln Arg Ala Gln Ile Ala Gln
65 70 75 80

Val Leu Asn Glu Lys Ala Ala Ala Val Pro Arg Leu Asp Arg Met Leu
85 90 95

Gly Arg Arg Phe Asp Leu Glu Lys Gly Gly Ser Ser Ala Val Gly Ala
100 105 110

Ala Ile Lys Ala Ala Asp Ser Arg Leu Thr Ser Lys Gln Thr Phe Ala
115 120 125

Ser Phe Gln Gln Trp Ala Glu Lys Ala Glu Ala Leu Gly Arg Asp Thr
130 135 140

Glu Ile Gly Ile Tyr Met Ile Tyr Lys Arg Asp Thr Pro Asp Thr Thr
145 150 155 160

Pro Met Asn Ala Ala Glu Gln Glu His Tyr Leu Glu Thr Leu Gln Ala
165 170 175

Leu Asp Asn Lys Lys Asn Leu Ile Ile Arg Pro Gln Ile His Asp Asp
180 185 190

Arg Glu Glu Glu Glu Leu Asp Leu Gly Arg Tyr Ile Ala Glu Asp Arg
195 200 205

Asn Ala Arg Thr Gly Phe Phe Arg Met Val Pro Lys Asp Gln Arg Ala
210 215 220

Pro Glu Thr Asn Ser Gly Arg Leu Thr Ile Gly Val Glu Pro Lys Tyr
225 230 235 240

Gly Ala Gln Leu Ala Leu Ala Met Ala Thr Leu Met Asp Lys His Lys
245 250 255

Ser Val Thr Gln Gly Lys Val Val Gly Pro Ala Lys Tyr Gly Gln Gln
260 265 270

Thr Asp Ser Ala Ile Leu Tyr Ile Asn Gly Asp Leu Ala Lys Ala Val
275 280 285

Lys Leu Gly Glu Lys Leu Lys Lys Leu Ser Gly Ile Pro Pro Glu Gly
290 295 300

Phe Val Glu His Thr Pro Leu Ser Met Gln Ser Thr Gly Leu Gly Leu
305 310 315 320

Ser Tyr Ala Glu Ser Val Glu Gly Gln Pro Ser Ser His Gly Gln Ala
325 330 335

Arg Thr His Val Ile Met Asp Ala Leu Lys Gly Gln Gly Pro Met Glu
340 345 350

Asn Arg Leu Lys Met Ala Leu Ala Glu Arg Gly Tyr Asp Pro Glu Asn
355 360 365

Pro Ala Leu Arg Ala Arg Asn
370 375



A DNA molecule from Pseudomonas syringae pv. atrofaciens strain
B143 encodes a homolog of HopPsyA and has a nucleotide sequence (SEQ. ID.
No. 63) as follows:

atgaacccga tacaaacgcg tttctctaac gtcgaagcac ttagacattc agaggtggat 60
gtacaggagc tcaaagcaca cggtcaaata gaagtgggtg gcaaatgcta cgacattcgc 120
gcggctgcca ataacgacct gactgtccag cgttctgaca aacagatggc gatgagcaag 180
tttttcaaaa aagcagggtt aagtgggagt tccggcagtc agtccgatca aattgcgcag 240
gtactgaatg acaagcgcgg ctcttccgtt ccccgtctta tacgccaggg gcagacccat 300
ctgggccgta tgcaattcaa catcgaagag gggcaaggca gttcggccgc cacgtccgtc 360
cagaacagca ggctgcccaa tggccgcttg gtaaacagca gtattttgca atgggtcgaa 420
aaggcgaaag ccaatggcag cacaagtacc agtgctcttt atcagatcta cgcaaaagaa 480
ctcccgcgtg tagaactgct gccacgcact gagcaccggg cgtgtctggc gcatatgtat 540
aagctgaacg gtaaggacgg tatcagtatt tggccgcagt ttctggatgg cgtgcgcggg 600
ttgcagctaa aacatgacac aaaagtgttc atgatgaaca accccaaagc agcggacgag 660
ttctacaaga tcgaacgttc gggcacgcaa tttccggatg aggctgtcaa ggcgcgcctg 720
acgataaatg tcaaacctca attccagaag gccatggtcg acgcagcggt caggttgacc 780
gctgagcgtc acgatatcat tactgccaaa gtggcaggtc ctgcaaagat tggcacgatt 840
acagatgcag cggttttcta tgtaagcgga gatttttccg ctgcgcagac acttgcaaaa 900
gagcttcagg cactgctccc tgacgatgcg tttatcaatc atacgccagc tggaatgcaa 960
tccatgggca aggggctgtg ttacgccgag cgtacaccgc aggacaggac aagccacgga 1020
atgtcgcgcg ccagcataat cgagtcggca ctggcagaca ccagcaggtc gtcactggag 1080
aagaagctgc gcaatgcttt caagagcgcc ggatacaatc ccgacaaccc ggcattcagg 1140
ttggaatga 1149



The encoded HopPsyA homolog has an amino acid sequence according to SEQ. ID.
No. 64 as follows:



Met Asn Pro Ile Gln Thr Arg Phe Ser Asn Val Glu Ala Leu Arg His
1 5 10 15

Ser Glu Val Asp Val Gln Glu Leu Lys Ala His Gly Gln Ile Glu Val
20 25 30

Gly Gly Lys Cys Tyr Asp Ile Arg Ala Ala Ala Asn Asn Asp Leu Thr
35 40 45

Val Gln Arg Ser Asp Lys Gln Met Ala Met Ser Lys Phe Phe Lys Lys
50 55 60

Ala Gly Leu Ser Gly Ser Ser Gly Ser Gln Ser Asp Gln Ile Ala Gln
65 70 75 80

Val Leu Asn Asp Lys Arg Gly Ser Ser Val Pro Arg Leu Ile Arg Gln
85 90 95

Gly Gln Thr His Leu Gly Arg Met Gln Phe Asn Ile Glu Glu Gly Gln
100 105 110

Gly Ser Ser Ala Ala Thr Ser Val Gln Asn Ser Arg Leu Pro Asn Gly
115 120 125

Arg Leu Val Asn Ser Ser Ile Leu Gln Trp Val Glu Lys Ala Lys Ala
130 135 140

Asn Gly Ser Thr Ser Thr Ser Ala Leu Tyr Gln Ile Tyr Ala Lys Glu
145 150 155 160

Leu Pro Arg Val Glu Leu Leu Pro Arg Thr Glu His Arg Ala Cys Leu
165 170 175

Ala His Met Tyr Lys Leu Asn Gly Lys Asp Gly Ile Ser Ile Trp Pro
180 185 190

Gln Phe Leu Asp Gly Val Arg Gly Leu Gln Leu Lys His Asp Thr Lys
195 200 205

Val Phe Met Met Asn Asn Pro Lys Ala Ala Asp Glu Phe Tyr Lys Ile
210 215 220

Glu Arg Ser Gly Thr Gln Phe Pro Asp Glu Ala Val Lys Ala Arg Leu
225 230 235 240

Thr Ile Asn Val Lys Pro Gln Phe Gln Lys Ala Met Val Asp Ala Ala
245 250 255

Val Arg Leu Thr Ala Glu Arg His Asp Ile Ile Thr Ala Lys Val Ala
260 265 270

Gly Pro Ala Lys Ile Gly Thr Ile Thr Asp Ala Ala Val Phe Tyr Val
275 280 285

Ser Gly Asp Phe Ser Ala Ala Gln Thr Leu Ala Lys Glu Leu Gln Ala
290 295 300

Leu Leu Pro Asp Asp Ala Phe Ile Asn His Thr Pro Ala Gly Met Gln
305 310 315 320

Ser Met Gly Lys Gly Leu Cys Tyr Ala Glu Arg Thr Pro Gln Asp Arg
325 330 335

Thr Ser His Gly Met Ser Arg Ala Ser Ile Ile Glu Ser Ala Leu Ala
340 345 350

Asp Thr Ser Arg Ser Ser Leu Glu Lys Lys Leu Arg Asn Ala Phe Lys
355 360 365

Ser Ala Gly Tyr Asn Pro Asp Asn Pro Ala Phe Arg Leu Glu
370 375 380



A DNA molecule from Pseudomonas syringae pv. tomato strain
DC3000 encodes a homolog of HopPtoA, identified herein as HopPtoA2, and has a
nucleotide sequence (SEQ. ID. No. 65) as follows:



atgcacatca accaatccgc ccaacaaccg cctggcgttg caatggagag ttttcggaca 60
gcttccgacg cgtcccttgc ttcgagttct gtgcggtctg tcagcactac ctcgtgccgc 120
gatctacaag ctattaccga ttatctgaaa catcacgtgt tcgctgcgca caggttttcg 180
gtaataggct caccggatga gcgtgatgcc gctcttgcac acaacgagca gatcgatgcg 240
ttggtagaga cacgcgccaa ccgcctgtac tccgaagggg agacccccgc aaccatcgcc 300
gaaacattcg ccaaggcgga aaagttcgac cgtttggcga cgaccgcatc aagtgctttt 360
gagaacacgc catttgccgc tgcctcggtg cttcagtaca tgcagcctgc gatcaacaag 420
ggcgattggc tagcaacgcc gctcaagccg ctgaccccgc tcatttccgg agcgctgtcg 480
ggagccatgg accaggtggg caccaaaatg atggatcgtg cgaggggtga tctgcattac 540
ctgagcactt cgccggacaa gttgcatgat gcgatggccg tatcggtgaa gcgccactcg 600
cctgcgcttg gtcgacaggt tgtggacatg gggattgcag tgcagacgtt ctcggcgcta 660
aatgtggtgc gtaccgtatt ggctccagca ctagcgtcca gaccgtcggt gcagggtgct 720
gttgattttg gcgtatctac ggcgggtggc ttggttgcga atgcaggctt tggcgaccgc 780
atgctcagtg tgcaatcgcg cgatcaactg cgtggggggg cattcgtact tggcatgaaa 840
gataaagagc ccaaggccgc gttgagtgaa gaaactgatt ggcttgatgc ttacaaagcg 900
atcaagtcgg ccagctactc aggtgcggcg ctcaatgcgg gcaagcggat ggccggcctg 960
ccactggacg tcgcgaccga cgggctcaag gcggtgagaa gtctggtgtc ggccaccagc 1020
ctgacaaaaa atggcctggc cctagccggt ggttacgccg gggtaagtaa gttgcagaaa 1080
atggcgacga aaaatatcac tgattcggcg accaaggctg cggttagtca gctgagcaac 1140
ctggtgggtt cggtaggcgt tttcgcaggc tggaccaccg ctggactggc gactgaccct 1200
gcggttaaga aagccgagtc gtttatacag gataaggtga aatcgaccgc atctagtacc 1260
acaagctatg ttgccgacca gaccgtcaaa ctggcgaaaa cagtcaagga catgagcggg 1320
gaggcgatct ccagcaccgg tgccagctta cgcagtactg tcaataacct gcgtcatcgc 1380
tccgctccgg aagctgatat cgaagaaggt gggatttcgg cgttttctcg aagtgaaaca 1440
ccgtttcagc tcaggcgttt gtaa 1464

Although hopPtoA2 does not lie within the CEL, it is included here as a homolog of
hopPtoA, which corresponds to CEL ORF5 as noted above. The encoded
HopPtoA2 protein or polypeptide has an amino acid sequence according to SEQ.
ID. No. 66 as follows:



Met His Ile Asn Gln Ser Ala Gln Gln Pro Pro Gly Val Ala Met Glu
1 5 10 15

Ser Phe Arg Thr Ala Ser Asp Ala Ser Leu Ala Ser Ser Ser Val Arg
20 25 30

Ser Val Ser Thr Thr Ser Cys Arg Asp Leu Gln Ala Ile Thr Asp Tyr
35 40 45

Leu Lys His His Val Phe Ala Ala His Arg Phe Ser Val Ile Gly Ser
50 55 60

Pro Asp Glu Arg Asp Ala Ala Leu Ala His Asn Glu Gln Ile Asp Ala
65 70 75 80

Leu Val Glu Thr Arg Ala Asn Arg Leu Tyr Ser Glu Gly Glu Thr Pro
85 90 95

Ala Thr Ile Ala Glu Thr Phe Ala Lys Ala Glu Lys Phe Asp Arg Leu
100 105 110

Ala Thr Thr Ala Ser Ser Ala Phe Glu Asn Thr Pro Phe Ala Ala Ala
115 120 125

Ser Val Leu Gln Tyr Met Gln Pro Ala Ile Asn Lys Gly Asp Trp Leu
130 135 140

Ala Thr Pro Leu Lys Pro Leu Thr Pro Leu Ile Ser Gly Ala Leu Ser
145 150 155 160

Gly Ala Met Asp Gln Val Gly Thr Lys Met Met Asp Arg Ala Arg Gly
165 170 175

Asp Leu His Tyr Leu Ser Thr Ser Pro Asp Lys Leu His Asp Ala Met
180 185 190

Ala Val Ser Val Lys Arg His Ser Pro Ala Leu Gly Arg Gln Val Val
195 200 205

Asp Met Gly Ile Ala Val Gln Thr Phe Ser Ala Leu Asn Val Val Arg
210 215 220

Thr Val Leu Ala Pro Ala Leu Ala Ser Arg Pro Ser Val Gln Gly Ala
225 230 235 240

Val Asp Phe Gly Val Ser Thr Ala Gly Gly Leu Val Ala Asn Ala Gly
245 250 255

Phe Gly Asp Arg Met Leu Ser Val Gln Ser Arg Asp Gln Leu Arg Gly
260 265 270

Gly Ala Phe Val Leu Gly Met Lys Asp Lys Glu Pro Lys Ala Ala Leu
275 280 285

Ser Glu Glu Thr Asp Trp Leu Asp Ala Tyr Lys Ala Ile Lys Ser Ala
290 295 300

Ser Tyr Ser Gly Ala Ala Leu Asn Ala Gly Lys Arg Met Ala Gly Leu
305 310 315 320

Pro Leu Asp Val Ala Thr Asp Gly Leu Lys Ala Val Arg Ser Leu Val
325 330 335

Ser Ala Thr Ser Leu Thr Lys Asn Gly Leu Ala Leu Ala Gly Gly Tyr
340 345 350

Ala Gly Val Ser Lys Leu Gln Lys Met Ala Thr Lys Asn Ile Thr Asp
355 360 365

Ser Ala Thr Lys Ala Ala Val Ser Gln Leu Ser Asn Leu Val Gly Ser
370 375 380

Val Gly Val Phe Ala Gly Trp Thr Thr Ala Gly Leu Ala Thr Asp Pro
385 390 395 400

Ala Val Lys Lys Ala Glu Ser Phe Ile Gln Asp Lys Val Lys Ser Thr
405 410 415

Ala Ser Ser Thr Thr Ser Tyr Val Ala Asp Gln Thr Val Lys Leu Ala
420 425 430

Lys Thr Val Lys Asp Met Ser Gly Glu Ala Ile Ser Ser Thr Gly Ala
435 440 445

Ser Leu Arg Ser Thr Val Asn Asn Leu Arg His Arg Ser Ala Pro Glu
450 455 460

Ala Asp Ile Glu Glu Gly Gly Ile Ser Ala Phe Ser Arg Ser Glu Thr
465 470 475 480

Pro Phe Gln Leu Arg Arg Leu
485



Fragments of the above-identified proteins or polypeptides as well as
fragments of full length proteins from the EELs and CELs of other bacteria, in
particular Gram-negative pathogens, can also be used according to the present
invention.



Suitable fragments can be produced by several means. Subclones of
the gene encoding a known protein can be produced using conventional molecular
genetic manipulation for subcloning gene fragments, such as described by Sambrook
et al., 1989, and Ausubel et al., 1994. The subclones then are expressed in vitro or in
vivo in bacterial cells to yield a smaller protein or polypeptide that can be tested for
activity, e.g., as a product required for pathogen virulence.

In another approach, based on knowledge of the primary structure of
the protein, fragments of the protein-coding gene may be synthesized using the PCR
technique together with specific sets of primers chosen to represent particular portions
of the protein (Erlich et al., 1991). These can then be cloned into an appropriate
vector for expression of a truncated protein or polypeptide from bacterial cells as
described above.
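
By way of illustration only, the portion of an open reading frame that encodes a chosen run of residues can be located by simple codon arithmetic before primers are designed. The following Python sketch assumes 1-based residue numbering and an ORF that begins at its start codon; the function name `coding_subsequence` is hypothetical, and the test string is the first 30 nucleotides of SEQ. ID. No. 57.

```python
def coding_subsequence(orf, first_residue, last_residue):
    """Return the nucleotides encoding residues first_residue..last_residue.

    Residues are numbered from 1, as in the sequence listings; the ORF is
    assumed to start at its first codon (an assumption of this sketch).
    """
    orf = orf.replace(" ", "").lower()
    start = 3 * (first_residue - 1)   # 0-based index of the first codon
    end = 3 * last_residue            # index just past the last codon
    if first_residue < 1 or last_residue < first_residue or end > len(orf):
        raise ValueError("residue range lies outside the open reading frame")
    return orf[start:end]


# First 30 nucleotides of SEQ. ID. No. 57 (codons for residues 1 to 10).
ORF_START = "atgaaaaact catttgatct tcttgtcgac"

# Nucleotides encoding residues 2 to 4 (Lys Asn Ser).
print(coding_subsequence(ORF_START, 2, 4))
```

Primers flanking the returned span would still need to be checked for melting temperature and specificity, which this sketch does not attempt.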

As an alternative, fragments of a protein can be produced by digestion 
of a full-length protein with proteolytic enzymes like chymotrypsin or Staphylococcus 
proteinase A, or trypsin. Different proteolytic enzymes are likely to cleave different 
proteins at different sites based on the amino acid sequence of the particular protein. 
Some of the fragments that result from proteolysis may be active virulence proteins or 
polypeptides. 
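
The dependence of cleavage sites on amino acid sequence can be sketched computationally. The Python example below is an illustration under simplified, assumed specificity rules (trypsin cutting after Lys or Arg, chymotrypsin after Phe, Trp, or Tyr, neither cutting before Pro); real enzymes show additional preferences and missed cleavages. The function name `digest` is hypothetical, and the test peptide is the first 20 residues of SEQ. ID. No. 58 in one-letter code.

```python
def digest(protein, cut_after, blocked_before="P"):
    """Split a one-letter protein string after each residue in cut_after.

    A site is skipped when the following residue is in blocked_before.
    These rules are a simplification assumed for this sketch.
    """
    fragments = []
    start = 0
    for i, residue in enumerate(protein):
        following = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in cut_after and following and following not in blocked_before:
            fragments.append(protein[start:i + 1])
            start = i + 1
    fragments.append(protein[start:])  # C-terminal fragment
    return fragments


# Residues 1 to 20 of SEQ. ID. No. 58, converted to one-letter code.
PEPTIDE = "MKNSFDLLVDGLAKDYSMPN"

print(digest(PEPTIDE, cut_after="KR"))    # trypsin-like rule
print(digest(PEPTIDE, cut_after="FWY"))   # chymotrypsin-like rule
```

The two rules give different fragment sets from the same peptide, which is the point made in the paragraph above; whether any fragment retains activity must be determined experimentally.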

Chemical synthesis can also be used to make suitable fragments. Such
a synthesis is carried out using known amino acid sequences for the polypeptide being
produced. Alternatively, subjecting a full length protein to high temperatures and
pressures will produce fragments. These fragments can then be separated by
conventional procedures (e.g., chromatography, SDS-PAGE).

Variants may also (or alternatively) be modified by, for example, the 
deletion or addition of amino acids that have minimal influence on the properties, 
secondary structure and hydropathic nature of the polypeptide. For example, a 
polypeptide may be conjugated to a signal (or leader) sequence at the N-terminal end 
of the protein which co-translationally or post-translationally directs transfer of the 
protein. The polypeptide may also be conjugated to a linker or other sequence for 
ease of synthesis, purification, or identification of the polypeptide. 

The proteins or polypeptides used in accordance with the present 
invention are preferably produced in purified form (preferably at least about 80%, 
more preferably 90%, pure) by conventional techniques. Typically, the protein or 
polypeptide of the present invention is secreted into the growth medium of 
recombinant host cells (discussed infra). Alternatively, the protein or polypeptide of 
the present invention is produced but not secreted into growth medium. In such cases, 
to isolate the protein, the host cell (e.g., E. coli) carrying a recombinant plasmid is 
propagated, lysed by sonication, heat, or chemical treatment, and the homogenate is 
centrifuged to remove bacterial debris. The supernatant is then subjected to 
sequential ammonium sulfate precipitation. The fraction containing the protein or 
polypeptide of interest is subjected to gel filtration in an appropriately sized dextran
or polyacrylamide column to separate the proteins. If necessary, the protein fraction
may be further purified by HPLC.

DNA molecules encoding other EEL and CEL proteins or polypeptides
can be identified using a PCR-based methodology for cloning portions of the
pathogenicity islands of a bacterium. Basically, the PCR-based strategy involves the
use of conserved sequences from the hrpK and tRNA leu genes (or other conserved
border sequences) as primers for cloning EEL intervening regions of the
pathogenicity island. As shown in Figures 2B-C, the hrpK and tRNA leu genes are
highly conserved among diverse Pseudomonas syringae variants. Depending upon
the size of the EEL, additional primers can be prepared from the originally obtained
cDNA sequence, allowing for recovery of clones and walking through the EEL in a
step-wise fashion. If full-length coding sequences are not obtained from the PCR
steps, contigs can be assembled to prepare full-length coding sequences using suitable
restriction enzymes. Similar PCR-based procedures can be used for obtaining clones
that encode open reading frames in the CEL. As shown in Figure 3, the CELs of
diverse Pseudomonas syringae pathovars contain numerous conserved domains.
Moreover, known sequences of the hrp/hrc domain, hrpW, AvrE, or gstA can be used
to prepare primers.

Using the above-described PCR-based methods, a number of DNA
sequences were utilized as the source for primers. One such DNA molecule is
isolated from the tRNA leu gene of Pseudomonas syringae pv. tomato DC3000, which
has a nucleotide sequence (SEQ. ID. No. 67) as follows:



gccctgatgg cggaattggt agacgcggcg gattcaaaat ccgttttcga aagaagtggg 60
agttcgattc tccctcgggg caccacca 88

An additional DNA molecule which can be used to supply suitable primers is from the
tRNA leu gene of Pseudomonas syringae pv. syringae B728a, which has a nucleotide
sequence (SEQ. ID. No. 68) as follows:

gccctgatgg cggaattggt agacgcggcg gattcaaaat ccgttttcga aagaagtggg 60
agttcgattc tccctcgggg cacca 85
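
The conservation of these two border sequences, and the derivation of candidate primers from the shared region, can be sketched as follows. This Python example is illustrative only: the 20-nucleotide primer length and the helper names `reverse_complement` and `candidate_primers` are assumptions of the sketch, and no claim is made that these particular oligonucleotides were the primers actually used.

```python
# SEQ. ID. No. 67 (DC3000, 88 nt) and SEQ. ID. No. 68 (B728a, 85 nt),
# copied from the listings above with the spacing removed.
SEQ_67 = ("gccctgatggcggaattggtagacgcggcggattcaaaatccgttttcgaaagaagtggg"
          "agttcgattctccctcggggcaccacca")
SEQ_68 = ("gccctgatggcggaattggtagacgcggcggattcaaaatccgttttcgaaagaagtggg"
          "agttcgattctccctcggggcacca")

COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}


def reverse_complement(dna):
    """Return the reverse complement of a lower-case DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def candidate_primers(template, length=20):
    """Return (forward, reverse) primers taken from the template ends.

    The forward primer is the first `length` bases; the reverse primer is
    the reverse complement of the last `length` bases. The default length
    of 20 is an arbitrary choice for this sketch.
    """
    return template[:length], reverse_complement(template[-length:])


# The shorter sequence is identical to the start of the longer one.
shared = SEQ_68 if SEQ_67.startswith(SEQ_68) else ""
forward, reverse = candidate_primers(shared)
print(len(shared), forward, reverse)
```

Because the reverse primer anneals to the opposite strand, it is reported 5' to 3' as the reverse complement of the template end.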

Another DNA molecule is isolated from the queA gene of Pseudomonas syringae pv.
tomato DC3000, which has a nucleotide sequence (SEQ. ID. No. 69) as follows:

atgcgcgtcg ctgactttac cttcgaactc cccgattccc tgattgctcg tcacccgttg 60
gccgagcgtc gcagcagtcg tctgttgacc cttgatgggc cgacgggcgc gctggcacat 120
cgtcaattca ccgatttgct cgagcatttg cgctcgggcg acttgatggt gttcaacaat 180
acccgtgtca ttcccgcacg tttgttcggg cagaaggcgt ccggcggcaa gctggagatt 240
ctggtcgagc gcgtgctgga cagccatcgt gtgctggcgc acgtgcgtgc cagcaagtcg 300
ccaaagccgg gctcgtcgat cctgatcgat ggcggcggcg aggccgagat ggtggcgcgg 360
catgacgcgc tgttcgagtt gcgctttgcc gaagaagtgc tgccgttgct ggatcgtgtc 420
ggccatatgc cgttgcctcc ttatatagac cgcccggacg aaggtgccga ccgcgagcgt 480
tatcagaccg tttacgccca gcgcgccggt gctgtggcgg cgccgactgc cggcctgcat 540
ttcgaccagc cgttgatgga agcaattgcc gccaagggcg tcgagactgc ttttgtcact 600
ctgcacgtcg gcgcgggtac gttccagccg gtgcgtgtcg agcagatcga agatcaccac 660
atgcacagcg aatggctgga agtcagccag gacgtggtcg atgccgtggc ggcgtgccgt 720
gcgcggggcg ggcgggtgat tgcggtcggg accaccagcg tgcgttcgct ggagagtgcc 780
gcgcgtgatg gccagttgaa gccgtttagc ggcgacaccg acatcttcat ctatccgggg 840
cggccgtttc atgtggtcga tgccctggtg actaattttc atttgcctga atccacgctg 900
ttgatgctgg tttcggcgtt cgccggttat cccgaaacca tggcggccta cgcggcggcc 960
atcgaacacg ggtaccgctt cttcagttac ggtgatgcca tgttcatcac ccgcaatccc 1020
gcgccgacgg ccccacagga atcggcacca gaggatcacg catga 1065

This DNA molecule encodes QueA, which has an amino acid sequence (SEQ. ID. No.
70) as follows:



Met Arg Val Ala Asp Phe Thr Phe Glu Leu Pro Asp Ser Leu Ile Ala
1 5 10 15

Arg His Pro Leu Ala Glu Arg Arg Ser Ser Arg Leu Leu Thr Leu Asp
20 25 30

Gly Pro Thr Gly Ala Leu Ala His Arg Gln Phe Thr Asp Leu Leu Glu
35 40 45

His Leu Arg Ser Gly Asp Leu Met Val Phe Asn Asn Thr Arg Val Ile
50 55 60

Pro Ala Arg Leu Phe Gly Gln Lys Ala Ser Gly Gly Lys Leu Glu Ile
65 70 75 80

Leu Val Glu Arg Val Leu Asp Ser His Arg Val Leu Ala His Val Arg
85 90 95

Ala Ser Lys Ser Pro Lys Pro Gly Ser Ser Ile Leu Ile Asp Gly Gly
100 105 110

Gly Glu Ala Glu Met Val Ala Arg His Asp Ala Leu Phe Glu Leu Arg
115 120 125

Phe Ala Glu Glu Val Leu Pro Leu Leu Asp Arg Val Gly His Met Pro
130 135 140

Leu Pro Pro Tyr Ile Asp Arg Pro Asp Glu Gly Ala Asp Arg Glu Arg
145 150 155 160

Tyr Gln Thr Val Tyr Ala Gln Arg Ala Gly Ala Val Ala Ala Pro Thr
165 170 175

Ala Gly Leu His Phe Asp Gln Pro Leu Met Glu Ala Ile Ala Ala Lys
180 185 190

Gly Val Glu Thr Ala Phe Val Thr Leu His Val Gly Ala Gly Thr Phe
195 200 205

Gln Pro Val Arg Val Glu Gln Ile Glu Asp His His Met His Ser Glu
210 215 220

Trp Leu Glu Val Ser Gln Asp Val Val Asp Ala Val Ala Ala Cys Arg
225 230 235 240

Ala Arg Gly Gly Arg Val Ile Ala Val Gly Thr Thr Ser Val Arg Ser
245 250 255

Leu Glu Ser Ala Ala Arg Asp Gly Gln Leu Lys Pro Phe Ser Gly Asp
260 265 270

Thr Asp Ile Phe Ile Tyr Pro Gly Arg Pro Phe His Val Val Asp Ala
275 280 285

Leu Val Thr Asn Phe His Leu Pro Glu Ser Thr Leu Leu Met Leu Val
290 295 300

Ser Ala Phe Ala Gly Tyr Pro Glu Thr Met Ala Ala Tyr Ala Ala Ala
305 310 315 320

Ile Glu His Gly Tyr Arg Phe Phe Ser Tyr Gly Asp Ala Met Phe Ile
325 330 335

Thr Arg Asn Pro Ala Pro Thr Ala Pro Gln Glu Ser Ala Pro Glu Asp
340 345 350

His Ala



DNA molecules encoding other EEL and CEL proteins or polypeptides can
also be identified by determining whether such DNA molecules hybridize under
stringent conditions to a DNA molecule as identified above. An example of suitable
stringency conditions is when hybridization is carried out at a temperature of about
37°C using a hybridization medium that includes 0.9M sodium citrate ("SSC") buffer,
followed by washing with 0.2x SSC buffer at 37°C. Higher stringency can readily be
attained by increasing the temperature for either hybridization or washing conditions
or increasing the sodium concentration of the hybridization or wash medium.
Nonspecific binding may also be controlled using any one of a number of known
techniques such as, for example, blocking the membrane with protein-containing
solutions, addition of heterologous RNA, DNA, and SDS to the hybridization buffer,
and treatment with RNase. Wash conditions are typically performed at or below
stringency. Exemplary high stringency conditions include carrying out hybridization
at a temperature of about 42°C to about 65°C for up to about 20 hours in a
hybridization medium containing 1M NaCl, 50 mM Tris-HCl, pH 7.4, 10 mM EDTA,

Also encompassed by the present invention are nucleic acid molecules
which contain conserved substitutions as compared to the above identified DNA
molecules and, thus, encode the same protein or polypeptides identified above.
Further, complementary sequences are also encompassed by the present invention.
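
As a minimal illustration of such substitutions, the Python sketch below translates two DNA strings with the standard genetic code and confirms that they encode the same peptide. The first string is the opening 30 nucleotides of SEQ. ID. No. 57; the second is a hypothetical variant constructed for this example by swapping in synonymous codons, and is not a sequence disclosed herein. The names `translate` and `VARIANT` are likewise assumptions of the sketch.

```python
from itertools import product

# Standard genetic code, with codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA (frame 1) to one-letter amino acids; '*' marks a stop."""
    dna = dna.replace(" ", "").lower()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 30 nucleotides of SEQ. ID. No. 57.
ORIGINAL = "atgaaaaact catttgatct tcttgtcgac"
# Hypothetical variant: every codon after the first replaced by a synonym.
VARIANT = "atgaagaatt cgttcgacct gctcgtggat"

print(translate(ORIGINAL))
print(translate(VARIANT))
print(translate(ORIGINAL) == translate(VARIANT))
```

The two strings differ at nine positions yet translate identically, which is the sense in which a substituted nucleic acid can encode the same polypeptide.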

The nucleic acid of the present invention can be either DNA or RNA, 
which can readily be prepared using the above identified DNA molecules of the 
present invention. 

The delivery of effector proteins or polypeptides can be achieved in 
several ways, depending upon the host being treated and the materials being used: (1) 
as a stable or plasmid-encoded transgene; (2) transiently expressed via Agrobacterium 
or viral vectors; (3) delivered by the type III secretion systems of disarmed pathogens 
or recombinant nonpathogenic bacteria which express a functional, heterologous type 
III secretion system; or (4) delivered via topical application followed by TAT protein 
transduction domain-mediated spontaneous uptake into cells. Each of these is 
discussed infra: 

The DNA molecule encoding the protein or polypeptide can be 
incorporated in cells using conventional recombinant DNA technology. Generally, 
this involves inserting the DNA molecule into an expression system to which the 
DNA molecule is heterologous (i.e., not normally present). The heterologous DNA
molecule is inserted into the expression system or vector in proper sense orientation 
and correct reading frame. The vector contains the necessary elements for the 
transcription and translation of the inserted protein-coding sequences. 

U.S. Patent No. 4,237,224 to Cohen and Boyer describes the 
production of expression systems in the form of recombinant plasmids using 
restriction enzyme cleavage and ligation with DNA ligase. These recombinant 
plasmids are then introduced by means of transformation and replicated in unicellular 
cultures including prokaryotic organisms and eukaryotic cells grown in tissue culture. 

Recombinant genes may also be introduced into viruses, such as
vaccinia virus. Recombinant viruses can be generated by transfection of plasmids into
cells infected with virus.

Suitable vectors include, but are not limited to, the following viral
vectors such as lambda vector system gt11, gt WES.tB, Charon 4, and plasmid vectors
such as pBR322, pBR325, pACYC177, pACYC1084, pUC8, pUC9, pUC18, pUC19,
pLG339, pR290, pKC37, pKC101, SV 40, pBluescript II SK +/- or KS +/- (see
"Stratagene Cloning Systems" Catalog (1993) from Stratagene, La Jolla, Calif., which
is hereby incorporated by reference), pQE, pIH821, pGEX, pET series (see Studier et
al., 1990). Recombinant molecules can be introduced into cells via transformation,
particularly transduction, conjugation, mobilization, or electroporation. The DNA
sequences are cloned into the vector using standard cloning procedures in the art, as
described by Sambrook et al., 1989.

A variety of host-vector systems may be utilized to express the
protein-encoding sequence(s). Primarily, the vector system must be compatible with the host
cell used. Host-vector systems include, but are not limited to, the following: bacteria
transformed with bacteriophage DNA, plasmid DNA, or cosmid DNA;
microorganisms such as yeast containing yeast vectors; mammalian cell systems
infected with virus (e.g., vaccinia virus, adenovirus, etc.); insect cell systems infected
with virus (e.g., baculovirus); and plant cells infected by bacteria. The expression
elements of these vectors vary in their strength and specificities. Depending upon the
host-vector system utilized, any one of a number of suitable transcription and
translation elements can be used.

Different genetic signals and processing events control many levels of 
gene expression (e.g., DNA transcription and messenger RNA (mRNA) translation). 

Transcription of DNA is dependent upon the presence of a promoter 
which is a DNA sequence that directs the binding of RNA polymerase and thereby 
promotes mRNA synthesis. The DNA sequences of eukaryotic promoters differ from
those of prokaryotic promoters. Eukaryotic promoters and accompanying genetic 
signals may not be recognized in or may not function in a prokaryotic system and, 
further, prokaryotic promoters are not recognized and do not function in eukaryotic 
cells. 

Similarly, translation of mRNA in prokaryotes depends upon the
presence of the proper prokaryotic signals which differ from those of eukaryotes.
Efficient translation of mRNA in prokaryotes requires a ribosome binding site called
the Shine-Dalgarno ("SD") sequence on the mRNA. This sequence is a short
nucleotide sequence of mRNA that is located before the start codon, usually AUG,
which encodes the amino-terminal methionine of the protein. The SD sequences are
complementary to the 3'-end of the 16S rRNA (ribosomal RNA) and probably
promote binding of mRNA to ribosomes by duplexing with the rRNA to allow correct
positioning of the ribosome. For a review on maximizing gene expression, see
Roberts and Lauer, 1979.

Promoters vary in their "strength" (i.e., their ability to promote 
transcription). For the purposes of expressing a cloned gene, it is desirable to use 
strong promoters in order to obtain a high level of transcription and, hence, expression 
of the gene. Depending upon the host cell system utilized, any one of a number of 
suitable promoters may be used. For instance, when cloning in E. coli, its 
bacteriophages, or plasmids, promoters such as the T7 phage promoter, lac promoter, 
trp promoter, recA promoter, ribosomal RNA promoter, the PR and PL promoters of
coliphage lambda and others, including but not limited to, lacUV5, ompF, bla, lpp,
and the like, may be used to direct high levels of transcription of adjacent DNA
segments. Additionally, a hybrid trp-lacUV5 (tac) promoter or other E. coli
promoters produced by recombinant DNA or other synthetic DNA techniques may be 
used to provide for transcription of the inserted gene. 

Bacterial host cell strains and expression vectors may be chosen which 
inhibit the action of the promoter unless specifically induced. In certain operations, 
the addition of specific inducers is necessary for efficient transcription of the inserted 
DNA. For example, the lac operon is induced by the addition of lactose or IPTG
(isopropylthio-beta-D-galactoside). A variety of other operons, such as trp, pro, etc., 
are under different controls. 

Specific initiation signals are also required for efficient gene 
transcription and translation in prokaryotic cells. These transcription and translation 
initiation signals may vary in "strength" as measured by the quantity of gene specific 
messenger RNA and protein synthesized, respectively. The DNA expression vector, 
which contains a promoter, may also contain any combination of various "strong" 
transcription and/or translation initiation signals. For instance, efficient translation in 
E. coli requires an SD sequence about 7-9 bases 5' to the initiation codon ("ATG") to
provide a ribosome binding site. Thus, any SD-ATG combination that can be utilized
by host cell ribosomes may be employed. Such combinations include but are not
limited to the SD-ATG combination from the cro gene or the N gene of coliphage
lambda, or from the E. coli tryptophan E, D, C, B or A genes. Additionally, any
SD-ATG combination produced by recombinant DNA or other techniques involving
incorporation of synthetic nucleotides may be used.
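
The spacing requirement between the SD sequence and the initiation codon can be checked on a candidate construct with a short script. The following is a minimal sketch only: the motif "AGGAGG" is assumed here as an illustrative SD core (actual ribosome binding sites vary), the function name find_sd_atg is hypothetical, and "7-9 bases" is read as the number of bases between the end of the motif and the ATG, which is one possible interpretation of the text.

```python
import re

# Assumption: "AGGAGG" serves as an illustrative Shine-Dalgarno core motif;
# real ribosome binding sites often match only part of it.
SD_MOTIF = "AGGAGG"


def find_sd_atg(dna, min_gap=7, max_gap=9):
    """Return (sd_start, atg_start, gap) for each SD motif followed by ATG.

    gap counts the bases between the end of the SD motif and the A of ATG
    (an assumed reading of "about 7-9 bases 5' to the initiation codon").
    """
    dna = dna.upper()
    hits = []
    # The lookahead lets overlapping motif occurrences be reported as well.
    for match in re.finditer("(?=%s)" % SD_MOTIF, dna):
        sd_start = match.start()
        sd_end = sd_start + len(SD_MOTIF)
        for gap in range(min_gap, max_gap + 1):
            atg_start = sd_end + gap
            if dna[atg_start:atg_start + 3] == "ATG":
                hits.append((sd_start, atg_start, gap))
    return hits


# Hypothetical construct: upstream bases, SD motif, 8-base spacer, ATG codon.
construct = "TTGACA" + "AGGAGG" + "AAATTTAA" + "ATGGCTAGC"
print(find_sd_atg(construct))
```

A construct with no hit in the 7-9 base window would return an empty list, which may indicate that the spacer should be adjusted before cloning.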

Once the isolated DNA molecule encoding the polypeptide or protein 
has been cloned into an expression system, it is ready to be incorporated into a host 
cell. Such incorporation can be carried out by the various forms of transformation 
noted above, depending upon the vector/host cell system. Suitable host cells include, 
but are not limited to, bacteria, virus, yeast, mammalian cells, insect, plant, and the 
like. 

Because it is desirable for recombinant host cells to secrete the 
encoded protein or polypeptide, it is preferable that the host cell also possess a 
functional type III secretion system. The type III secretion system can be 
heterologous to the host cell (Ham et al., 1998) or the host cell can naturally possess a
type III secretion system. Host cells which naturally contain a type III secretion
system include many pathogenic Gram-negative bacteria, such as numerous
Erwinia species, Pseudomonas species, Xanthomonas species, etc. Other type III 
secretion systems are known and still others are continually being identified. 
Pathogenic bacteria that can be utilized to deliver effector proteins or polypeptides are 
preferably disarmed according to known techniques, i.e., as described above. 
Alternatively, isolation of the effector protein or polypeptide from the host cell or 
growth medium can be carried out as described above. 

Another aspect of the present invention relates to a transgenic plant 
which expresses a protein or polypeptide of the present invention and methods of
making the same. 

In order to express the DNA molecule in isolated plant cells or tissue 
or whole plants, a plant expressible promoter is needed. Any plant-expressible 
promoter can be utilized regardless of its origin, i.e., viral, bacterial, plant, etc. 
Without limitation, two suitable promoters include the nopaline synthase promoter
(Fraley et al., 1983) and the cauliflower mosaic virus 35S promoter (Odell et al.,
1985). Both of these promoters yield constitutive expression of coding sequences
under their regulatory control.

While constitutive expression is generally suitable for expression of 
the DNA molecule, it should be apparent to those of skill in the art that temporally or 
tissue regulated expression may also be desirable, in which case any regulated 
promoter can be selected to achieve the desired expression. Typically, the temporally
or tissue regulated promoters will be used in connection with DNA molecules that
are expressed at only certain stages of development or only in certain tissues.

In some plants, it may also be desirable to use promoters which are 
responsive to pathogen infiltration or stress. For example, it may be desirable to limit 
expression of the protein or polypeptide in response to infection by a particular 
pathogen of the plant. One example of a pathogen-inducible promoter is the gst1
promoter from potato, which is described in U.S. Patent Nos. 5,750,874 and 
5,723,760 to Strittmayer et al., which are hereby incorporated by reference. 

Expression of the DNA molecule in isolated plant cells or tissue or 
whole plants also requires appropriate transcription termination and polyadenylation 
of mRNA. Any 3' regulatory region suitable for use in plant cells or tissue can be
operably linked to the first and second DNA molecules. A number of 3' regulatory 
regions are known to be operable in plants. Exemplary 3' regulatory regions include, 
without limitation, the nopaline synthase 3' regulatory region (Fraley et al., 1983) and 
the cauliflower mosaic virus 3' regulatory region (Odell et al., 1985). 

The promoter and a 3' regulatory region can readily be ligated to the
DNA molecule using well known molecular cloning techniques described in 
Sambrook et al., 1989. 

One approach to transforming plant cells with a DNA molecule of the 
present invention is particle bombardment (also known as biolistic transformation) of 
the host cell. This can be accomplished in one of several ways. The first involves 
propelling inert or biologically active particles at cells. This technique is disclosed in 
U.S. Patent Nos. 4,945,050, 5,036,006, and 5,100,792, all to Sanford, et al. 
Generally, this procedure involves propelling inert or biologically active particles at 
the cells under conditions effective to penetrate the outer surface of the cell and to be 
incorporated within the interior thereof. When inert particles are utilized, the vector
can be introduced into the cell by coating the particles with the vector containing the
heterologous DNA. Alternatively, the target cell can be surrounded by the vector so 
that the vector is carried into the cell by the wake of the particle. Biologically active 
particles (e.g., dried bacterial cells containing the vector and heterologous DNA) can 
also be propelled into plant cells. Other variations of particle bombardment, now 
known or hereafter developed, can also be used. 

Another method of introducing the DNA molecule into plant cells is 
fusion of protoplasts with other entities, either minicells, cells, lysosomes, or other 
fusible lipid-surfaced bodies that contain the DNA molecule (Fraley et al., 1982). 

The DNA molecule may also be introduced into the plant cells by 
electroporation (Fromm et al., 1985). In this technique, plant protoplasts are
electroporated in the presence of plasmids containing the DNA molecule. Electrical 
impulses of high field strength reversibly permeabilize biomembranes allowing the 
introduction of the plasmids. Electroporated plant protoplasts reform the cell wall, 
divide, and regenerate. 

Another method of introducing the DNA molecule into plant cells is to 
infect a plant cell with Agrobacterium tumefaciens or Agrobacterium rhizogenes 
previously transformed with the DNA molecule. Under appropriate conditions known 
in the art, the transformed plant cells are grown to form shoots or roots, and develop 
further into plants. Generally, this procedure involves inoculating the plant tissue 
with a suspension of bacteria and incubating the tissue for 48 to 72 hours on 
regeneration medium without antibiotics at 25-28°C. 

Agrobacterium is a representative genus of the Gram-negative family 
Rhizobiaceae. Its species are responsible for crown gall (A. tumefaciens) and hairy
root disease (A. rhizogenes). The plant cells in crown gall tumors and hairy roots are
induced to produce amino acid derivatives known as opines, which are catabolized 
only by the bacteria. The bacterial genes responsible for expression of opines are a 
convenient source of control elements for chimeric expression cassettes. In addition, 
assaying for the presence of opines can be used to identify transformed tissue. 

Heterologous genetic sequences such as a DNA molecule of the 
present invention can be introduced into appropriate plant cells by means of the Ti 
plasmid of A. tumefaciens or the Ri plasmid of A. rhizogenes. The Ti or Ri plasmid
is transmitted to plant cells on infection by Agrobacterium and is stably integrated
into the plant genome (Schell, 1987).

Plant tissue suitable for transformation includes leaf tissue, root tissue,
meristems, zygotic and somatic embryos, and anthers.

After transformation, the transformed plant cells can be selected and
regenerated.

Preferably, transformed cells are first identified using, e.g., a selection
marker simultaneously introduced into the host cells along with the DNA molecule of
the present invention. Suitable selection markers include, without limitation, markers
coding for antibiotic resistance, such as kanamycin resistance (Fraley et al., 1983). A
number of antibiotic-resistance markers are known in the art and others are continually
being identified. Any known antibiotic-resistance marker can be used to transform
and select transformed host cells in accordance with the present invention. Cells or
tissues are grown on a selection medium containing an antibiotic, whereby generally
only those transformants expressing the antibiotic resistance marker continue to grow.

Once a recombinant plant cell or tissue has been obtained, it is possible
to regenerate a full-grown plant therefrom. Thus, another aspect of the present
invention relates to a transgenic plant that includes a DNA molecule of the present
invention, wherein the promoter induces transcription of the first DNA molecule in
response to infection of the plant by an oomycete. Preferably, the DNA molecule is
stably inserted into the genome of the transgenic plant of the present invention.

Plant regeneration from cultured protoplasts is described in Evans et 
al., 1983, and Vasil, 1984 and 1986. 

It is known that practically all plants can be regenerated from cultured
cells or tissues, including but not limited to, all major species of rice, wheat, barley,
rye, cotton, sunflower, peanut, corn, potato, sweet potato, bean, pea, chicory, lettuce,
endive, cabbage, cauliflower, broccoli, turnip, radish, spinach, onion, garlic, eggplant,
pepper, celery, carrot, squash, pumpkin, zucchini, cucumber, apple, pear, melon,
strawberry, grape, raspberry, pineapple, soybean, tobacco, tomato, sorghum, and
sugarcane.

Means for regeneration vary from species to species of plants, but
generally a suspension of transformed protoplasts or a petri plate containing
transformed explants is first provided. Callus tissue is formed and shoots may be
induced from callus and subsequently rooted. Alternatively, embryo formation can be
induced in the callus tissue. These embryos germinate as natural embryos to form
plants. The culture media will generally contain various amino acids and hormones,
such as auxin and cytokinins. It is also advantageous to add glutamic acid and proline
to the medium, especially for such species as corn and alfalfa. Efficient regeneration
will depend on the medium, on the genotype, and on the history of the culture. If
these three variables are controlled, then regeneration is usually reproducible and
repeatable.

After the DNA molecule is stably incorporated in transgenic plants, it
can be transferred to other plants by sexual crossing or by preparing cultivars. With
respect to sexual crossing, any of a number of standard breeding techniques can be 
used depending upon the species to be crossed. Cultivars can be propagated in accord 
with common agricultural procedures known to those in the field. 

Diseases caused by the vast majority of bacterial pathogens result in
limited lesions. That is, even when everything is working in the pathogen's favor
(e.g., no triggering of the hypersensitive response because of R-gene detection of one
of the effectors), the parasitic process still triggers defenses after a couple of days,
which then stops the infection from spreading. Thus, the very same effectors that
enable parasitism to proceed must also eventually trigger defenses. Therefore,
premature expression of these effectors is believed to "turn on" plant defenses earlier
(i.e., prior to infection) and make the plant resistant to either the specific bacteria from
which the effector protein was obtained or many pathogens. An advantage of this
approach is that it involves natural products and plants seem highly sensitive to
pathogen effector proteins.

According to one embodiment, a transgenic plant is provided that
contains a heterologous DNA molecule of the present invention. Preferably, the
heterologous DNA molecule is derived from a plant pathogen EEL. When the
heterologous DNA molecule is expressed in the transgenic plant, plant defenses are
activated, imparting disease resistance to the transgenic plant. The transgenic plant
can also contain an R gene which is activated by the protein or polypeptide product of
the heterologous DNA molecule. The R gene can be naturally occurring in the plant
or heterologously inserted therein. A number of R genes have been identified in
various plant species, including without limitation: RPS2, RPM1, and RPP5 from
Arabidopsis thaliana; Cf2, Cf9, I2, Pto, and Prf from tomato; N from tobacco; L6 and
M from flax; Xa21 from rice; and Hs1pro-1 from sugar beet. In addition to imparting
disease resistance, it is believed that stimulation of plant defenses in transgenic plants
of the present invention will also result in a simultaneous enhancement in growth and
resistance to insects.

According to another embodiment, a plant, transgenic or
non-transgenic, is treated with a protein or polypeptide of the present invention. By
treating, it is intended to include various forms of applying the protein or polypeptide
to the plant. The embodiments of the present invention where the effector
polypeptide or protein is applied to the plant can be carried out in a number of ways,
including: 1) application of an isolated protein (or composition containing the same)
or 2) application of bacteria which do not cause disease and are transformed with a
gene encoding the effector protein of the present invention. In the latter embodiment,
the effector protein can be applied to plants by applying bacteria containing the DNA
molecule encoding the effector protein. Such bacteria are preferably capable of
secreting or exporting the protein so that the protein can contact plant cells. In these
embodiments, the protein is produced by the bacteria in planta.

Such topical application is typically carried out using an effector fusion
protein which includes a transduction domain, which will afford transduction
domain-mediated spontaneous uptake of the effector protein into cells. Basically, this is
carried out by fusing an 11-amino acid peptide (YGRKKRRQRRR, SEQ. ID. No. 91)
by standard rDNA techniques to the N-terminus of the effector protein, and the
resulting tagged protein is taken up into cells by a poorly understood process. This
peptide is the protein transduction domain (PTD) of the human immunodeficiency
virus (HIV) TAT protein (Schwarze et al., 2000). Other PTDs are known and may
possibly be used for this purpose (Prochiantz, 2000).
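
At the sequence level, the N-terminal fusion described above amounts to placing the 11-residue PTD ahead of the effector sequence. The sketch below illustrates only that bookkeeping step and is not a cloning protocol: the function name fuse_ptd and the effector fragment are hypothetical placeholders, and an actual construct would be built at the DNA level with its own start codon and any linker chosen by the practitioner.

```python
# PTD sequence as given in the text (SEQ. ID. No. 91).
TAT_PTD = "YGRKKRRQRRR"


def fuse_ptd(effector, ptd=TAT_PTD):
    """Return the PTD joined to the N-terminus of an effector sequence.

    Sequences are one-letter amino acid strings; only a basic sanity
    check is applied (letters only, non-empty).
    """
    effector = effector.strip().upper()
    if not effector.isalpha():
        raise ValueError("effector must be a one-letter amino acid string")
    return ptd + effector


# Hypothetical effector fragment, for illustration only.
example_effector = "MNPIHSRLLQ"
fusion = fuse_ptd(example_effector)
print(len(TAT_PTD), fusion)
```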

When the effector protein is topically applied to plants, it can be
applied as a composition, which includes a carrier in the form, e.g., of water, aqueous
solutions, slurries, or dry powders. In this embodiment, the composition contains
greater than about 5 nM of the protein of the present invention.

Although not required, this composition may contain additional
additives including fertilizer, insecticide, fungicide, nematicide, and mixtures thereof.
Suitable fertilizers include (NH4)2NO3. An example of a suitable insecticide is
Malathion. Useful fungicides include Captan. 

Other suitable additives include buffering agents, wetting agents,
coating agents, and, in some instances, abrading agents. These materials can be used
to facilitate the process of the present invention.
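
For preparing such a composition, the molar threshold of about 5 nM stated above can be converted to a mass concentration once the molecular weight of the protein is known. The following sketch shows the arithmetic; the 40,000 Da molecular weight is a hypothetical value chosen only for illustration, and the function name is likewise an assumption.

```python
def nanomolar_to_ug_per_ml(conc_nm, mol_weight_da):
    """Convert a concentration in nM to micrograms per mL.

    mol_weight_da is the protein molecular weight in Da (g/mol).
    """
    mol_per_l = conc_nm * 1e-9            # nM -> mol/L
    g_per_l = mol_per_l * mol_weight_da   # mol/L -> g/L
    return g_per_l * 1e3                  # 1 g/L equals 1000 ug/mL


# Hypothetical 40,000 Da effector protein at the 5 nM threshold.
threshold = nanomolar_to_ug_per_ml(5, 40000)
print(round(threshold, 6))
```

Under these assumptions, 5 nM corresponds to 0.2 micrograms of protein per mL of composition.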

According to another aspect of the present invention, a transgenic plant
is provided that contains a heterologous DNA molecule that encodes a transcript or a
protein or polypeptide capable of disrupting function of a plant pathogen CEL
product. Because the genes in the CEL are particularly important in pathogenesis,
disrupting the function of their products in plants can result in broad resistance since
CEL genes are highly conserved among Gram-negative pathogens, particularly along
species lines. An exemplary protein or polypeptide which can disrupt function of a
CEL product is an antibody, polyclonal or monoclonal, raised against the CEL
product using conventional techniques. Once isolated, the antibody can be sequenced
and nucleic acids synthesized for encoding the same. Such nucleic acids, e.g., DNA,
can be used to transform plants.

Transgenic plants can also be engineered so that they are
hypersusceptible and, therefore, will support the growth of nonpathogenic bacteria for
biotechnological purposes. It is known that many plant pathogenic bacteria can alter
the environment inside plant leaves so that nonpathogenic bacteria can grow. This
ability is presumably based on changes in the plant caused by pathogen effector
proteins. Thus, transgenic plants expressing the appropriate effector genes can be
used for these purposes.

According to one embodiment, a transgenic plant including a
heterologous DNA molecule of the present invention expresses one or more effector
proteins, wherein the transgenic plant is capable of supporting growth of compatible
nonpathogenic bacteria (i.e., non-pathogenic endophytes such as various Clavibacter
spp.). The compatible nonpathogenic bacteria can be naturally occurring or
recombinant. Preferably, the nonpathogenic bacteria are recombinant and express
one or more useful products. Thus, the transgenic plant becomes a green factory for
producing desirable products. Desirable products include, without limitation,
products that can enhance the nutritional quality of the plant or products that are
desirable in isolated form. If desired in isolated form, the product can be isolated
from plant tissues. To prevent competition between the non-pathogenic bacteria
which express the desired product and those that do not, it is possible to tailor the
needs of recombinant, non-pathogenic bacteria so that only they are capable of living
in plant tissues expressing a particular effector protein or polypeptide of the present
invention.

The effector proteins or polypeptides of the present invention are
believed to alter the plant physiology by shifting metabolic pathways to benefit the
parasite and by activating or suppressing cell death pathways. Thus, they may also 
provide useful tools for efficiently altering the nutrient content of plants and delaying 
or triggering senescence. There are agricultural applications for all of these possible 
effects. 

A further aspect of the present invention relates to diagnostic uses of
the CEL and EEL. The CEL genes are universal to species of Gram-negative bacteria,
particularly pathogenic Gram-negative bacteria (such as P. syringae), whereas the
EEL sequences are strain-specific and provide a "virulence gene fingerprint" that
could be used to track the presence, origins, and movement (and restrict the spread
through quarantines) of strains that are particularly threatening. Although the CEL
and EEL have been identified in various pathovars of Pseudomonas syringae, it is
expected that nearly all Gram-negative pathogens can be identified, distinguished, and
classified based upon the homology of the CEL and EEL genes.

According to one embodiment, a method of determining relatedness
between two bacteria is carried out by comparing a nucleic acid alignment or amino
acid alignment for a CEL of the two bacteria and then determining the relatedness of
the two bacteria, wherein a higher sequence identity indicates a closer relationship.
The CEL is particularly useful for determining the relatedness of two distinct bacterial 
species. 

According to another embodiment, a method of determining
relatedness between two bacteria is carried out by comparing a nucleic acid
alignment or amino acid alignment for an EEL of the two bacteria and then
determining the relatedness of the two bacteria, wherein a higher sequence identity
indicates a closer relationship. The EEL is particularly useful for determining the
relatedness of two pathovars of a single bacterial species.
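
The comparison underlying both embodiments can be sketched as a percent identity calculation over sequences that have already been aligned. This is an illustrative sketch only: the alignment step itself is assumed to have been performed with a separate tool, gapped columns are skipped (one of several conventions for computing identity), and the strain names and short sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def closest_relative(reference, candidates):
    """Return the name of the candidate with the highest identity to reference."""
    return max(candidates, key=lambda name: percent_identity(reference, candidates[name]))


# Hypothetical pre-aligned EEL fragments from a reference and two strains.
reference = "ATGGCTAGCTTAGGC"
strains = {
    "strain_x": "ATGGCTAGCTTAGGA",
    "strain_y": "ATGACTTGC-TAGGA",
}
print(closest_relative(reference, strains))
```

The strain with the higher identity score is treated as the closer relative, consistent with the rule that a higher sequence identity indicates a closer relationship.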

Given the methods of determining relatedness of bacteria species
and/or pathovars, these methods can be utilized in conjunction with plant breeding
programs. By detecting the "virulence gene fingerprint" of pathogens which are 
prevalent in a particular growing region, it is possible either to develop transgenic 
cultivars as described above or to identify existing plant cultivars which are resistant 
to the prevalent pathogens. 

In addition to the above described uses, another aspect of the present
invention relates to gene- and protein-based therapies for animals, preferably
mammals including, without limitation, humans, dogs, mice, and rats. The P. syringae pv.
syringae B728a EEL ORF5 protein (SEQ. ID. No. 32) is a member of the
AvrRxv/YopJ protein family. YopJ is injected into human cells by the Yersinia type
III secretion system, where it disrupts the function of certain protein kinases to inhibit
cytokine release and promote programmed cell death. It is believed that the targets of
many pathogen effector proteins (i.e., P. syringae effector proteins) will be universal
to eukaryotes and therefore have a variety of potentially useful functions. In fact, two
of the proteins in the P. syringae Hrp pathogenicity islands are toxic when expressed
in yeast. They are HopPsyA from the P. syringae pv. syringae EEL and HopPtoA
from the P. syringae pv. tomato DC3000 CEL. This supports the concept of universal
eukaryote targets.

Thus, a further aspect of the present invention relates to a method of
causing eukaryotic cell death which is carried out by introducing into a eukaryotic cell
a cytotoxic Pseudomonas protein. The cytotoxic Pseudomonas protein is preferably
HopPsyA (e.g., SEQ. ID. Nos. 36 (Psy 61), 62 (Psy 226), or 64 (Psy B143)), HopPtoA
(SEQ. ID. No. 7), or HopPtoA2 (SEQ. ID. No. 66). The eukaryotic cell which is
treated can be either in vitro or in vivo. When treating eukaryotic cells in vivo, a
number of different protein- or DNA-delivery systems can be employed to introduce
the effector protein into the target eukaryotic cell.

Without being bound by theory, it is believed that at least the HopPsyA
effector proteins exert their cytotoxic effects through Mad2 interactions, disrupting
cell checkpoint of spindle formation (see infra).

The protein- or DNA-delivery systems can be provided in the form of 
pharmaceutical compositions which include the delivery system in a pharmaceutically 
acceptable carrier, which may include suitable excipients or stabilizers. The dosage 
can be in solid or liquid form, such as powders, solutions, suspensions, or emulsions. 
Typically, the composition will contain from about 0.01 to 99 percent, preferably 
from about 20 to 75 percent of active compound(s), together with the carrier, 
excipient, stabilizer, etc. 

The compositions of the present invention are preferably administered 
in injectable or topically-applied dosages by solution or suspension of these materials 
in a physiologically acceptable diluent with a pharmaceutical carrier. Such carriers 
include sterile liquids, such as water and oils, with or without the addition of a 
surfactant and other pharmaceutically and physiologically acceptable carriers,
including adjuvants, excipients or stabilizers. Illustrative oils are those of petroleum, 
animal, vegetable, or synthetic origin, for example, peanut oil, soybean oil, or mineral 
oil. In general, water, saline, aqueous dextrose and related sugar solution, and 
glycols, such as propylene glycol or polyethylene glycol, are preferred liquid carriers, 
particularly for injectable solutions. 

Alternatively, the effector proteins can also be delivered via solution or 
suspension packaged in a pressurized aerosol container together with suitable 
propellants, for example, hydrocarbon propellants like propane, butane, or isobutane 
with conventional adjuvants. The materials of the present invention also may be 
administered in a non-pressurized form such as in a nebulizer or atomizer. 

Depending upon the treatment being effected, the compounds of the
present invention can be administered orally, topically, transdermally, parenterally,
subcutaneously, intravenously, intramuscularly, intraperitoneally, by intranasal
instillation, by intracavitary or intravesical instillation, intraocularly, intraarterially,
intralesionally, or by application to mucous membranes, such as those of the nose,
throat, and bronchial tubes.

Compositions within the scope of this invention include all
compositions wherein the compound of the present invention is contained in an
amount effective to achieve its intended purpose. While individual needs vary,
determination of optimal ranges of effective amounts of each component is within the
skill of the art.

One approach for delivering an effector protein into cells involves the 
use of liposomes. Basically, this involves providing a liposome which includes the
effector protein to be delivered, and then contacting the target cell with the liposome 
under conditions effective for delivery of the effector protein into the cell. 

Liposomes are vesicles comprised of one or more concentrically
ordered lipid bilayers which encapsulate an aqueous phase. They are normally not
leaky, but can become leaky if a hole or pore occurs in the membrane, if the
membrane is dissolved or degrades, or if the membrane temperature is increased to
the phase transition temperature. Current methods of drug delivery via liposomes
require that the liposome carrier ultimately become permeable and release the
encapsulated drug at the target site. This can be accomplished, for example, in a
passive manner wherein the liposome bilayer degrades over time through the action of
various agents in the body. Every liposome composition will have a characteristic
half-life in the circulation or at other sites in the body and, thus, by controlling the
half-life of the liposome composition, the rate at which the bilayer degrades can be
somewhat regulated.

In contrast to passive drug release, active drug release involves using 
an agent to induce a permeability change in the liposome vesicle. Liposome 
membranes can be constructed so that they become destabilized when the 
environment becomes acidic near the liposome membrane (see, e.g., Proc. Natl. Acad. 
Sci. USA 84:7851 (1987); Biochemistry 28:908 (1989), which are hereby 
incorporated by reference). When liposomes are endocytosed by a target cell, for 
example, they can be routed to acidic endosomes which will destabilize the liposome 
and result in drug release. 

Alternatively, the liposome membrane can be chemically modified 
such that an enzyme is placed as a coating on the membrane which slowly destabilizes 
the liposome. Since control of drug release depends on the concentration of enzyme 
initially placed in the membrane, there is no truly effective way to modulate or alter 
drug release to achieve "on demand" drug delivery. The same problem exists for pH- 
sensitive liposomes in that as soon as the liposome vesicle comes into contact with a 
target cell, it will be engulfed and a drop in pH will lead to drug release. 

This liposome delivery system can also be made to accumulate at a 
target organ, tissue, or cell via active targeting (e.g., by incorporating an antibody or 
hormone on the surface of the liposomal vehicle). This can be achieved according to 
known methods. 

Different types of liposomes can be prepared according to Bangham et 
al., (1965); U.S. Patent No. 5,653,996 to Hsu et al., U.S. Patent No. 5,643,599 to Lee 
et al.; U.S. Patent No. 5,885,613 to Holland et al.; U.S. Patent No. 5,631,237 to Dzau 
et al.; and U.S. Patent No. 5,059,421 to Loughrey et al. 

An alternative approach for delivery of effector proteins involves the 
conjugation of the desired effector protein to a polymer that is stabilized to avoid 
enzymatic degradation of the conjugated effector protein. Conjugated proteins or 
polypeptides of this type are described in U.S. Patent No. 5,681,811 to Ekwuribe. 

Yet another approach for delivery of proteins or polypeptides involves 
preparation of chimeric proteins according to U.S. Patent No. 5,817,789 to Heartlein 
et al. The chimeric protein can include a ligand domain and, e.g., an effector protein 
of the present invention. The ligand domain is specific for receptors located on a 
target cell. Thus, when the chimeric protein is delivered intravenously or otherwise 
introduced into blood or lymph, the chimeric protein will adsorb to the targeted cell, 
and the targeted cell will internalize the chimeric protein, which allows the effector 
protein to de-stabilize the cell checkpoint control mechanism, affording its cytotoxic 
effects. 

When it is desirable to achieve heterologous expression of an effector 
protein of the present invention in a target cell, DNA molecules encoding the desired 
effector protein can be delivered into the cell. Basically, this includes providing a 
nucleic acid molecule encoding the effector protein and then introducing the nucleic 
acid molecule into the cell under conditions effective to express the effector protein in 
the cell. Preferably, this is achieved by inserting the nucleic acid molecule into an 
expression vector before it is introduced into the cell. 

When transforming mammalian cells for heterologous expression of an 
effector protein, an adenovirus vector can be employed. Adenovirus gene delivery 
vehicles can be readily prepared and utilized given the disclosure provided in 
Berkner, 1988, and Rosenfeld et al., 1991. Adeno-associated viral gene delivery 
vehicles can be constructed and used to deliver a gene to cells. The use of adeno- 
associated viral gene delivery vehicles in vitro is described in Chatterjee et al., 1992; 
Walsh et al., 1992; Walsh et al., 1994; Flotte et al., 1993a; Ponnazhagan et al., 1994; 
Miller et al., 1994; Einerhand et al., 1995; Luo et al., 1995; and Zhou et al., 1996. In 
vivo use of these vehicles is described in Flotte et al., 1993b and Kaplitt et al., 1994. 
Additional types of adenovirus vectors are described in U.S. Patent No. 6,057,155 to 
Wickham et al.; U.S. Patent No. 6,033,908 to Bout et al.; U.S. Patent No. 6,001,557 to 
Wilson et al.; U.S. Patent No. 5,994,132 to Chamberlain et al.; U.S. Patent 
No. 5,981,225 to Kochanek et al.; U.S. Patent No. 5,885,808 to Spooner et al.; and 
U.S. Patent No. 5,871,727 to Curiel. 

Retroviral vectors which have been modified to form infective 
transformation systems can also be used to deliver nucleic acid encoding a desired 
effector protein into a target cell. One such type of retroviral vector is disclosed in 
U.S. Patent No. 5,849,586 to Kriegler et al. 

Regardless of the type of infective transformation system employed, it 
should be targeted for delivery of the nucleic acid to a specific cell type. For 
example, for delivery of the nucleic acid into tumor cells, a high titer of the infective 
transformation system can be injected directly within the tumor site so as to enhance 
the likelihood of tumor cell infection. The infected cells will then express the desired 
effector protein, e.g., HopPtoA, HopPsyA, or HopPtoA2, disrupting cellular functions 
and producing cytotoxic effects. 

Particularly preferred is use of the effector proteins of the present 
invention to treat a cancerous condition (i.e., the eukaryotic cell which is affected is a 
cancer cell). This can be carried out by introducing a cytotoxic Pseudomonas protein 
into cancer cells of a patient under conditions effective to inhibit cancer cell division, 
thereby treating the cancerous condition. 

By introducing, it is intended that the effector protein is administered 
to the patient, preferably in the form of a composition which will target delivery to the 
cancer cells. Alternatively, when using DNA-based therapies, it is intended that the 
introducing be carried out by administering a target DNA delivery system to the 
patient such that the cancer cells are targeted and the effector protein is expressed 
therein. 

Examples 

The following Examples are intended to be illustrative and in no way 
are intended to limit the scope of the present invention. 

Materials and Methods 

Bacterial Strains, Culture Conditions, Plasmids, and DNA Manipulation Techniques: 

Three experimentally amenable strains that represent different levels of 
diversity in P. syringae were investigated: Psy 61, Psy B728a, and Pto DC3000. 
(i) Psy 61 is a weak pathogen of bean whose hrp gene cluster, cloned on cosmid 
pHIR11, contains all of the genes necessary for nonpathogenic bacteria like 
Pseudomonas fluorescens and Escherichia coli to elicit the HR in tobacco and to 
secrete in culture the HrpZ harpin, a protein with unknown function that is secreted 
abundantly by the Hrp system (Alfano et al., 1996). The pHIR11 hrp cluster has been 
completely sequenced (Figure 1) (Alfano and Collmer, 1997), and the hopPsyA gene 
in the hypervariable region at the left edge of the cluster was shown to encode a 
protein that has an Avr phenotype, travels the Hrp pathway, and elicits cell death 
when expressed in tobacco cells (Alfano and Collmer, 1997; Alfano et al., 1997; van 
Dijk et al., 1999). (ii) Psy B728a is in the same pathovar as strain 61 but is highly 
virulent and is a model for studying the role of the Hrp system in epiphytic fitness and 
pathogenicity (brown spot of bean) in the field (Hirano et al., 1999). (iii) Pto DC3000 
is a well-studied pathogen of Arabidopsis and tomato (causing bacterial speck) that is 
highly divergent from pathovar syringae strains. Analysis of rRNA operon RFLP 
patterns has indicated that Pto and Psy are distantly related and could be considered 
separate species (Manceau and Horvais, 1997). Thus, we were able to compare two 
strains in the same pathovar with a strain from a highly divergent pathovar. 

Conditions for culturing E. coli and P. syringae strains have been 
described (van Dijk et al., 1999), as have the sources for Psy 61 (Preston et al., 1995), 
Psy B728a (Hirano et al., 1999), and Pto DC3000 (Preston et al., 1995). Cloning and 
DNA manipulations were done in E. coli DH5α using pBluescript II (Stratagene, 
La Jolla, CA), pRK415 (Keen et al., 1988), and cosmid pCPP47 (Bauer and Collmer, 
1997), according to standard procedures (Ausubel et al., 1994). Cosmid libraries of 
Pto DC3000 and Psy B728a genomic DNA were previously constructed (Charkowski 
et al., 1998). Oligonucleotide synthesis and DNA sequencing were performed at the 
Cornell Biotechnology Center. The nucleotide sequence of the Pto DC3000 hrp/hrc 
cluster was determined using subclones of pCPP2473, a cosmid selected from a 
genomic cosmid library based on hybridization with the hrpK gene of Psy 61. The 
nucleotide sequence of the Psy B728a hrp/hrc cluster was determined using subclones 
of pCPP2346 and pCPP3017. These cosmids were selected from a genomic library 
based on hybridization with the hrpC operon of 61. The left side of the Psy 61 EEL 
region was cloned by PCR into pBSKSII+ XhoI and EcoRI sites using the following 
primers: 

SEQ. ID. NO. 71, which primes within queA and contains an XhoI site: 

atgactcgag gcgtggattc aggcaaat 28 

SEQ. ID. NO. 72, which primes within hopPsyA and contains an EcoRI site: 

atgagaattc tgccgccgct ttctcgtt 28 

Pfu polymerase was used for all PCR experiments. DNA sequence data were 
managed and analyzed with the DNAStar Program (Madison, WI), and databases 
were searched with the BLASTX, BLASTP, and BLASTN programs (Altschul et al., 
1997). 
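The restriction sites built into the two cloning primers can be checked directly against the listed sequences. The following is a minimal sketch, assuming the standard recognition sequences for XhoI (CTCGAG) and EcoRI (GAATTC); the function name find_sites and the dictionary layout are illustrative only and are not part of the original methods.

```python
# Standard recognition sequences for the two enzymes named in the text.
SITES = {"XhoI": "ctcgag", "EcoRI": "gaattc"}

# Primer sequences copied from SEQ. ID. NOs. 71 and 72, with spaces removed.
PRIMERS = {
    "SEQ. ID. NO. 71": "atgactcgaggcgtggattcaggcaaat",
    "SEQ. ID. NO. 72": "atgagaattctgccgccgctttctcgtt",
}


def find_sites(primer: str) -> dict[str, int]:
    """Return {enzyme: 0-based start index} for each site found in the primer."""
    seq = primer.lower().replace(" ", "")
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


if __name__ == "__main__":
    for label, seq in PRIMERS.items():
        print(label, len(seq), find_sites(seq))
```

In both primers the site begins after a four-nucleotide 5' tail, and the remaining 18 nucleotides are the portion available to anneal to the target.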

Mutant Construction and Analysis: 

Large deletions in the Pto DC3000 Hrp Pai were constructed by 
subcloning border fragments into restriction sites on either side of an ΩSp^r cassette in 
pRK415, electroporating the recombinant plasmids into DC3000, and then selecting 
and screening for marker exchange mutants as described (Alfano et al., 1996). The 
following left and right side (Figures 2 and 3) deletion border fragments were used 
(with residual gene fragments indicated): for CUCPB5110 left tgt-queA-tRNA^Leu- 
ORF4' (27 bp of ORF4) and right ORF1'-hrpK (396 bp of ORF1); and for 
CUCPB5115 left hrpS'-avrE' (2569 bp of avrE) and right ORF6 (156 bp upstream of 
ORF6 start codon). The latter fragment was PCR-amplified using the following 
primers: 

SEQ. ID. NO. 73, which primes in the ORF5-ORF6 intergenic region and contains an 
XbaI site: 

cgctctagac caaggactgc 20 

SEQ. ID. NO. 74, which primes in ORF6 and contains a HindIII site: 

ccagaagctt ctgtttttga gtc 23 

Mutant constructions were confirmed by Southern hybridizations using previously 
described conditions (Charkowski et al., 1998). The ability of mutants to secrete 
AvrPto was determined with anti-AvrPto antibodies and immunoblot analysis of cell 
fractions as previously described (van Dijk et al., 1999). Mutant CUCPB5115 was 
complemented with pCPP3016, which carries ORF2 through ORF10 in cosmid 
pCPP47, and was introduced from E. coli DH5α by triparental mating using helper 
strain E. coli DH5α(pRK600), as described (Charkowski et al., 1998). 

T7 Expression Analysis: 

Protein products of the Pto DC3000 EEL were analyzed by T7 
polymerase-dependent expression using vector pET21 and E. coli BL21(DE3) as 
previously described (Huang et al., 1995). The following primer sets were used to 
PCR-amplify each ORF from pCPP3091, which carries in pBSKSII+ a BamHI fragment 
containing tgt to hrcV: 



ORF1, SEQ. ID. Nos. 75 and 76, respectively: 

agtaggatcc tgaaatgtag gggcccgg 28 

agtaaagctt atgatgctgt ttccagta 28 

ORF2, SEQ. ID. Nos. 77 and 78, respectively: 

agtaggatcc tctcgaagga atggagca 28 

agtaaagctt cgtgaagatg catttcgc 28 

ORF3, SEQ. ID. Nos. 79 and 80, respectively: 

agtaggatcc tagtcactga tcgaacgt 28 

agtactcgag ccacgaaata acacggta 28 

ORF4, SEQ. ID. Nos. 81 and 82, respectively: 

agtaggatcc caggactgcc ttccagcg 28 

agtactcgag cagagcggcg tccgtggc 28 

tnpA, SEQ. ID. Nos. 83 and 84, respectively: 

agtaggatcc agaattgttg aagaaatc 28 

agtaaagctt tgcgctgtta actcatcg 28 


Plant Bioassays: 

Tobacco (Nicotiana tabacum L. cv. Xanthi) and tomato 
(Lycopersicon esculentum Mill. cvs. Moneymaker and Rio Grande) were grown 
under greenhouse conditions and then maintained at 25°C with daylight and 
supplemental halide illumination for HR and virulence assays. Bacteria were grown 
overnight on King's medium B agar supplemented with appropriate antibiotics, 
suspended in 5 mM MES pH 5.6, and then infiltrated with a needleless syringe into 
the leaves of test plants at 10^8 cfu/ml for HR assays and 10^4 cfu/ml for pathogenicity 
assays (Charkowski et al., 1998). All assays were repeated at least four times on 
leaves from different plants. Bacterial growth in tomato leaves was assayed by 
excising disks from infiltrated areas with a cork borer, comminuting the tissue in 
0.5 ml of 5 mM MES, pH 5.6, with a Kontes Pellet Pestle (Fisher Scientific, 
Pittsburgh, PA), and then dilution plating the homogenate on King's medium B agar 
with 50 µg/ml rifampicin and 2 µg/ml cycloheximide to determine bacterial 
populations. The mean and SD from three leaf samples were determined for each 
time point. The relative growth in planta of DC3000 and CUCPB5110 was similarly 
assayed in 4 independent experiments and the relative growth of DC3000, 
CUCPB5115, and CUCPB5115(pCPP3016) in 3 independent experiments. Although 
the final population levels achieved by DC3000 varied between experiments, the 
population levels of the mutants relative to the wild type were the same as in the 
representative experiments presented below. 
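The arithmetic behind the dilution-plating assay can be sketched as follows. This is an illustration only: the colony count, dilution factor, plated volume, and replicate values are hypothetical, and the function names are not taken from the original methods. Only the inoculum concentrations (10^8 and 10^4 cfu/ml) and the use of a mean and SD from three leaf samples come from the text.

```python
import math
import statistics


def cfu_per_ml(colonies: int, dilution_factor: float, plated_volume_ml: float) -> float:
    """Back-calculate cfu/ml of the undiluted homogenate from one countable plate."""
    return colonies * dilution_factor / plated_volume_ml


def mean_and_sd(values: list[float]) -> tuple[float, float]:
    """Mean and sample standard deviation across replicate leaf samples."""
    return statistics.mean(values), statistics.stdev(values)


def tenfold_steps(start_cfu_per_ml: float, target_cfu_per_ml: float) -> int:
    """Number of serial tenfold dilutions needed to go from start to target."""
    return round(math.log10(start_cfu_per_ml / target_cfu_per_ml))


if __name__ == "__main__":
    # Hypothetical plate: 42 colonies from 0.1 ml of a 1:1000 dilution.
    print(cfu_per_ml(42, 1_000, 0.1))
    # Hypothetical populations from three leaf samples at one time point.
    print(mean_and_sd([1e5, 2e5, 3e5]))
    # Diluting the HR inoculum (10^8 cfu/ml) to the pathogenicity inoculum (10^4 cfu/ml).
    print(tenfold_steps(1e8, 1e4))
```

Whether the reported means were computed on raw or log-transformed counts is not stated in the text, so the sketch uses raw values.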

Example 1 - Comparison of hrp/hrc Gene Clusters of Psy 61, 
Psy B728a, and Pto DC3000 

To determine if the hrp/hrc clusters from Psy B728a and Pto DC3000 
were organized similarly to the previously characterized hrp/hrc cluster of Psy 61, 
two cosmids carrying hrp/hrc inserts were partially characterized. pCPP2346 carries 
the entire hrp/hrc cluster of B728a, and pCPP2473 carries the left half of the hrp/hrc 
cluster of DC3000. The right half of the DC3000 hrp/hrc cluster had been 
characterized previously (Preston et al., 1995). Sequencing the ends of several 
subclones derived from these cosmids provided fingerprints of the B728a and 
DC3000 hrp/hrc clusters, which indicated that both are arranged like that of strain 61 
(Fig. 1). However, B728a contains between hrcU and hrpV a 3.6-kb insert with 
homologs of bacteriophage lambda genes Ea59 (23% amino-acid identity; E = 2e-7) 
and Ea31 (30% amino-acid identity; E = 6e-8) (Hendrix et al., 1983), and the B728a 
hrcU ORF has 36 additional codons. A possible insertion of this size in several Psy 
strains that are highly virulent on bean was suggested by a previous RFLP analysis 
(Legard et al., 1993). Cosmid pCPP2346, which contains the B728a hrp/hrc region 
and flanking sequences (4 kb on the left and 13 kb on the right), enabled P. 
fluorescens to secrete the B728a HrpZ harpin in culture and to elicit the HR in 
tobacco leaves; however, confluent necrosis developed more slowly than with P. 
fluorescens(pHIR11) (data not shown). To further test the relatedness of the Psy 61 
and B728a hrp/hrc gene clusters using an internal reference, the B728a hrpA gene 
was sequenced. Of the hrp/hrc genes that have been sequenced in Psy and Pto, hrpA, 
which encodes the major subunit of the Hrp pilus (Roine et al., 1997), is the least 
conserved (28% amino-acid identity) (Preston et al., 1995). However, the hrpA genes 
of strains 61 and B728a were 100% identical, which further supports the close 
relationship of these strains and their Hrp systems. 
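The percent-identity figures quoted in this Example can be understood through a small calculation on a pairwise alignment. The sketch below is illustrative: the function name and the toy aligned sequences are hypothetical, and identity is counted over columns where neither sequence has a gap, which is one common convention and may differ from the denominator used by BLAST.

```python
def percent_identity(aln_a: str, aln_b: str) -> float:
    """Percent identical residues over aligned columns where neither sequence has a gap."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aln_a, aln_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    # Hypothetical aligned peptide fragments ("-" marks a gap); not real HrpA sequences.
    print(percent_identity("MKTA-LLV", "MKSAGLIV"))
    # Two identical sequences score 100%, as reported for hrpA of strains 61 and B728a.
    print(percent_identity("MKTALLV", "MKTALLV"))
```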

Example 2 - Identification of an Exchangeable Effector Locus (EEL) in the Hrp 
Pai between hrpK and tRNA^Leu 

Sequence analysis of the left side of the Psy 61, Psy B728a, and Pto 
DC3000 Hrp Pais revealed that the high percentage identity in hrpK sequences in 
these strains abruptly terminates three nucleotides after the hrpK stop codon and then 
is restored near tRNA^Leu, queA, and tgt sequences after 2.5 kb (Psy 61), 7.3 kb (Psy 
B728a), or 5.9 kb (Pto DC3000) of dissimilar, intervening DNA (Figure 2). The 
difference between Psy strains 61 and B728a in this region was particularly 
surprising. This region of the P. syringae Hrp Pai was given the EEL designation 
because it contained completely different effector protein genes (Table 1 below), 
which appear to be exchanged at this locus at a high frequency. In this regard, it is 
noteworthy that (i) ORF2 in the B728a EEL is a homolog of avrPphE, which is in a 
different location, immediately downstream of hrpK (hrpY), in Pph 1302A (Mansfield 
et al., 1994), (ii) hopPsyA (hrmA) is present in only a few Psy strains (Heu and 
Hutcheson, 1993; Alfano et al., 1997), and (iii) ORF5 in the B728a EEL predicts a 
protein that is similar to Xanthomonas AvrBsT and possesses multiple motifs 
characteristic of the AvrRxv family (Ciesiolka et al., 1999). G + C content different 
from the genomic average is a hallmark of horizontally transferred genes, and the G + 
C contents of the ORFs in the three EELs are considerably lower than the average of 
59-61% for P. syringae (Palleroni et al., 1984) (Table 1 below). They are also lower 
than hrpK (60%) and queA (63-64%). The ORFs in the Pto DC3000 EEL predict no 
products with similarity to known effector proteins; however, T7 polymerase- 
dependent expression revealed products in the size range predicted for ORF1, ORF3, 
and ORF4. Furthermore, the ORF1 protein is secreted in a hrp-dependent manner by 
E. coli(pCPP2156), which expresses an Erwinia chrysanthemi Hrp system that 
secretes P. syringae Avr proteins (Ham et al., 1998). Several ORFs in these EELs are 
preceded by Hrp boxes indicative of HrpL-activated promoters (Figure 1) (Xiao and 
Hutcheson, 1994), and the lack of intervening Rho-independent terminator sequences 
or promoters suggests that ORF1 in DC3000 and ORF1 and ORF2 in B728a are 
expressed from HrpL-activated promoters upstream of the respective hrpK genes. 
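The G + C comparison used in this paragraph reduces to a simple base count. The following is a minimal sketch of that calculation; the function names are illustrative, the test sequence is a toy example rather than an EEL ORF, and only the 59-61% genomic range is taken from the text.

```python
GENOME_GC_RANGE = (59.0, 61.0)  # average % G+C for P. syringae cited in the text


def gc_percent(seq: str) -> float:
    """Percent G+C of a DNA sequence, ignoring characters other than A, C, G, T."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, or T")
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)


def below_genome_average(seq: str, low: float = GENOME_GC_RANGE[0]) -> bool:
    """True when the sequence's % G+C falls below the lower bound of the genomic range."""
    return gc_percent(seq) < low


if __name__ == "__main__":
    toy = "ATGCATTTAGCCAT"  # hypothetical sequence for illustration
    print(round(gc_percent(toy), 1), below_genome_average(toy))
```

A lower-than-average value is only suggestive of horizontal transfer; short sequences in particular can deviate from the genomic average by chance.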

The EELs of these three strains also contain sequences homologous to 
insertion sequences, transposases, phage integrase genes, and plasmids (Figure 2 and 
Table 1 below). The Psy B728a ORF5 and ORF6 operon is bordered on the left side 
by sequences similar to those in a Pph plasmid that carries several avr genes (Jackson 
et al., 1999) and by a sequence homologous to insertion elements that are typically 
found on plasmids, suggesting plasmid integration via an IS element in this region 
(Szabo and Mills, 1984). Psy B728a ORF3 and ORF4 show similarity to sequences 
implicated in the horizontal acquisition of the LEE Pai by pathogenic E. coli strains 
(Perna et al., 1998). These Psy B728a ORFs are not preceded by Hrp boxes and are 
unlikely to encode effector proteins. 



Table 1: ORFs and fragments of genetic elements in the EELs of Pto DC3000, Psy B728a, and 
Psy 61 

ORF or        % G+C   Size     BLAST E value with representative similar sequence(s) in 
sequence                       database, or relevant feature 

Pto DC3000 (a) 
ORF1          55      466 aa   Hrp-secreted (Alfano, unpublished) 
TnpA'         55      279 aa   1e-125 P. stutzeri TnpA1 (Bosch et al., 1999) 
ORF2          51      241 aa   None 
ORF3          53      138 aa   None 
ORF4          47      136 aa   None 

Psy B728a 
ORF1          51      323 aa   9e-40 Pph AvrPphC (Yucel et al., 1994) 
ORF2          58      382 aa   1e-154 Pph AvrPphE (Mansfield et al., 1994) 
ORF3          55      507 aa   2e-63 E. coli L0015 (Perna et al., 1998) 
ORF4          55      118 aa   9e-9 E. coli L0014 (Perna et al., 1998) 
ORF5          49      411 aa   1e-4 Xcv AvrBsT (Ciesiolka et al., 1999) 
ORF6          52      120 aa   None 
B plasmid     46      96 nt    1e-25 Pph pAV511 (Jackson et al., 1999) 
IntA'         59      49 aa    3e-5 E. coli CP4-like integrase (Perna et al., 1998) 

Psy 61 
HopPsyA       53      375 aa   Hrp-secreted Avr (Alfano et al., 1997; van Dijk et al., 1999) 
ShcA          57      112 aa   6e-4 Y0008 (Perry et al., 1998) 

(a) Pathovar abbreviations correspond to the recommendations of Vivian and Mansfield (1993) for 
uniform avr nomenclature. 
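The % G+C column of Table 1 can be summarized per strain and compared with the 59-61% genomic average cited in Example 2. The sketch below transcribes the table values into a dictionary; the function names and data layout are illustrative and not part of the original analysis.

```python
from statistics import mean

# % G+C values transcribed from Table 1 (strain -> {ORF or sequence: % G+C}).
EEL_GC = {
    "Pto DC3000": {"ORF1": 55, "TnpA'": 55, "ORF2": 51, "ORF3": 53, "ORF4": 47},
    "Psy B728a": {"ORF1": 51, "ORF2": 58, "ORF3": 55, "ORF4": 55, "ORF5": 49,
                  "ORF6": 52, "B plasmid": 46, "IntA'": 59},
    "Psy 61": {"HopPsyA": 53, "ShcA": 57},
}
GENOME_GC_LOW = 59  # lower bound of the 59-61% genomic average cited in the text


def mean_gc_by_strain(table: dict[str, dict[str, int]]) -> dict[str, float]:
    """Average % G+C of the listed entries for each strain."""
    return {strain: mean(values.values()) for strain, values in table.items()}


def entries_below(table: dict[str, dict[str, int]], threshold: float) -> list[tuple[str, str]]:
    """(strain, entry) pairs whose % G+C is strictly below the threshold."""
    return [(strain, name) for strain, values in table.items()
            for name, gc in values.items() if gc < threshold]


if __name__ == "__main__":
    print(mean_gc_by_strain(EEL_GC))
    print(len(entries_below(EEL_GC, GENOME_GC_LOW)), "entries below", GENOME_GC_LOW)
```

By this tally every entry except the IntA' fragment (59%) falls below the lower bound of the genomic range.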



The left border of the EELs contains sequences similar to many 
tRNA^Leu genes and to E. coli queA and tgt queuosine biosynthesis genes (ca. 70% 
amino-acid identity in predicted products). The EEL sequences terminate at the 3' end 
of the P. syringae tRNA sequences, as is typical for Pais (Hou, 1999). Virtually 
identical tgt-queA-tRNA^Leu sequences are found in the genome of P. aeruginosa 
PAO1 (www.pseudomonas.com), which is also in the fluorescent pseudomonad 
group. But PAO1 is not a plant pathogen, and this tRNA^Leu in P. aeruginosa is not 
linked to any type III secretion system genes or other genes in the Hrp Pai (Figure 2). 
Thus, this is the apparent point of insertion of the Hrp Pai in the ancestral 
Pseudomonas genome. 

Example 3 - Identification of a Conserved Effector Locus (CEL) Located on the 
Right Side of the Hrp Pai in Psy B728a and Pto DC3000 

Previous studies of the region to the right of hrpR in DC3000 had 
revealed the existence of the avrE locus, which is comprised of two transcriptional 
units (Lorang and Keen, 1995), the 5' sequences for the first 4 transcriptional units 
beyond hrpR (Lorang and Keen, 1995), and the identity of the fourth transcriptional 
unit as the hrpW gene encoding a second harpin (Charkowski et al., 1998). The DNA 
sequence of the first 14 ORFs to the right of hrpR in Pto DC3000 was completed in 
this investigation and the corresponding region in Psy B728a was partially sequenced 
(Figure 3). Like the EEL, this region contains putative effector genes, e.g., avrE 
(Lorang and Keen, 1995). Unlike the EEL, the ORFs in this region have an average 
G + C content of 58.0%, which is close to that of the hrp/hrc genes, the region 
contains no sequences similar to known mobile genetic elements, and it appears 
conserved between Psy and Pto (Figure 3). Comparison of the regions sequenced in 
B728a and DC3000 revealed that the first 7 ORFs are arranged identically and have 
an average DNA sequence identity of 78%. Hence, this region was given the CEL 
designation. 

The precise border of the CEL remains undefined, and no sequences 
that were repeated in the EEL border of the Hrp Pai were found. ORF7 and ORF8 are 
likely to be part of the CEL, based on the presence of an upstream Hrp box (Figure 3). 
However, the region beyond ORF10 probably is not in the CEL because the product 
of the next ORF shows homology to a family of bacterial GstA proteins (e.g., 28% 
identity with E. coli GstA over 204 amino acids; E = 1e-8) (Blattner et al., 1997), and 
glutathione-S-transferase activity is common in nonpathogenic fluorescent 
pseudomonads (Zablotowicz et al., 1995). The presence of a galP homolog (38% 
identity over 256 amino acids, based on incomplete sequence, to E. coli GalP; E = 2e- 
42) (Blattner et al., 1997) in this region further suggests that it is beyond the CEL. 

Several other features of this region in B728a and DC3000 are 
noteworthy. (i) Both strains have a 1-kb intergenic region between hrpR and ORF1 
that is distinguished by low sequence identity (44%) but which contains three inverted 
repeats that could form stem loop structures affecting expression of the hrpRS operon. 
(ii) ORF1 is most similar to E. coli murein lytic transglycosylase MltD (38% identity 
over 324 amino acids; E = 4e-56). (iii) ORF2 is 42% identical over 130 amino acids 
with E. amylovora DspF (E = 9e-24), a candidate chaperone (Bogdanove et al., 
1998a; Gaudriault et al., 1997). (iv) The ORF5 protein is secreted in a hrp-dependent 
manner by E. coli(pCPP2156), but mutation with an ΩSp^r cassette has little effect on 
either HR elicitation in tobacco or pathogenicity in tomato (Charkowski, 
unpublished). (v) Finally, six operons in this region are preceded by Hrp boxes 
(Lorang and Keen, 1995) (Figure 3), which is characteristic of known avr genes in P. 
syringae (Alfano et al., 1996). Thus, the CEL carries multiple candidate effectors. 

Example 4 - Investigation of EEL and CEL Roles in Pathogenicity 

A mutation was constructed in DC3000 that replaced all of the ORFs 
between hrpK and tRNA^Leu (EEL) with an ΩSp^r cassette (Figure 2). This Pto mutant, 
CUCPB5110, was tested for its ability to elicit the HR in tobacco and to cause disease 
in tomato. The mutant retained the ability to elicit the HR and to produce disease 
symptoms, but it failed to reach population levels as high as the parental strain in 
tomato (Figure 4A). 

A mutation was constructed in DC3000 that replaced avrE through 
ORF5 (CEL) with an ΩSp^r cassette. This deleted all of the CEL ORFs that were both 
partially characterized and likely to encode effectors. This Pto mutant, CUCPB5115, 
still elicited the HR in tobacco, but tissue collapse was delayed ca. 5 h (Figure 4C). 
The mutant no longer elicited disease symptoms in tomato when infiltrated at a 
concentration of 10^4 cfu/ml, and growth in planta was strongly reduced (Figure 4B). 
However, the mutant elicited an HR dependent on the tomato Pto R gene that was 
indistinguishable from the wild-type in tests involving PtoS (susceptible) and PtoR 
(resistant) Rio Grande tomato lines. Plasmid pCPP3016, which carries ORF2 through 
ORF10, fully restored the ability of CUCPB5115 to cause disease symptoms and 
partially restored the ability of the mutant to multiply in tomato leaves (Figures 4B 
and 4E). Deletion of the hrp/hrc cluster abolishes HR and pathogenicity phenotypes 
in Pto DC3000 (Collmer et al., 2000). To confirm that the large deletions in Pto 
mutants CUCPB5110 and CUCPB5115 did not disrupt Hrp secretion functions, we 
compared the ability of these mutants, the DC3000 hrp/hrc deletion mutant, and wild- 
type DC3000 to make and secrete AvrPto in culture while retaining a cytoplasmic 
marker comprised of β-lactamase lacking its signal peptide. AvrPto provided an ideal 
subject for this test because it is a well-studied effector protein that is secreted in 
culture and injected into host cells in planta (Alfano and Collmer, 1997; van Dijk et 
al., 1999). Only the hrp/hrc cluster deletion mutant was impaired in AvrPto 
production and secretion (Figure 5). 

Based on the above studies, the P. syringae hrp/hrc genes are part of a 
Hrp Pai that has three distinct loci: an EEL, the hrp/hrc gene cluster, and a CEL. The 
EEL harbors exchangeable effector genes and makes only a quantitative contribution 
to parasitic fitness in host plants. The hrp/hrc locus encodes the Hrp secretion system 
and is required for effector protein delivery, parasitism, and pathogenicity. The CEL 
makes no discernible contribution to Hrp secretion functions but contributes strongly 
to parasitic fitness and is required for Pto pathogenicity in tomato. The Hrp Pai of 
P. syringae has several properties of Pais possessed by animal pathogens (Hacker et 
al., 1997), including the presence of many virulence-associated genes (several with 
relatively low G + C content) in a large (ca. 50-kb) chromosomal region linked to a 
tRNA locus and absent from the corresponding locus in a closely related species. In 
addition, the EEL portion of the Hrp Pai is unstable and contains many sequences 
related to mobile genetic elements. 

The EEL is a novel feature of known Pais, which is likely involved in 
fine-tuning the parasitic fitness of P. syringae strains with various plant hosts. By 
comparing closely- and distantly-related strains of P. syringae, we were able to 
establish the high instability of this locus and the contrasting high conservation of its 
border sequences. No single mechanism can explain the high instability, as we found 
fragments related to phages, insertion sequences, and plasmids in the Psy and Pto 
EELs, and insertion sequences were recently reported in the corresponding region of 
three other P. syringae strains (Inoue and Takikawa, 1999). The mechanism or 
significance of the localization of the EELs between tRNA^Leu and hrpK sequences in 
the Hrp Pais also is unclear. Pto DC3000 carries at least one other effector gene, 
avrPto, that is located elsewhere in the genome (Ronald et al., 1992), many 
P. syringae avr genes are located on plasmids (Leach and White, 1996), and the EEL 
ORFs represent a mix of widespread (e.g., avrRxv family) and seemingly rare (e.g., 
hopPsyA) effector genes. The G + C content of the EEL ORFs is significantly lower 
than that of the rest of the Hrp Pai and the P. syringae genome. Although certain 
genes in the non-EEL portions of the Hrp Pai, such as hrpA, are highly divergent, they 
have a high G + C content, and there is no evidence that they have been horizontally 
transferred separately from the rest of the Hrp Pai. The relatively low G + C content 
of the ORFs in the EELs (and of other P. syringae avr genes) suggests that these 
genes may be horizontally acquired from a wider pool of pathogenic bacteria than just 
P. syringae (Kim et al., 1998). Indeed, the avrRxv family of genes is found in a wide 
range of plant and animal pathogens (Ciesiolka et al., 1999). The weak effect on 
parasitic fitness of deleting the Pto DC3000 EEL, or of mutating hopPsyA (hrmA) in 
Psy 61 (Huang et al., 1991), is typical of mutations in individual avr genes and 
presumably results from redundancy in the effector protein system (Leach and White, 
1996). 

The functions of hrpK and of the CEL ORF1 are unclear but warrant 
discussion. These two ORFs reside just outside the hrpL and hrpR delimited cluster 
of operons containing both hrp and hrc genes and thereby spatially separate the three 
regions of the Hrp Pai (Figures 1-3). hrpK mutants have a variable Hrp phenotype 
(Mansfield et al., 1994; Bozso et al., 1999), and a Psy B728a hrpK mutant still 
secretes HrpZ (Alfano, unpublished), which suggests that HrpK may be an effector 
protein. Nevertheless, the HrpK proteins of Psy 61 and Pto DC3000 are 79% 
identical and therefore are more conserved than many Hrp secretion system 
components. It is also noteworthy that hrpK appears to be in an operon with other 
effector genes in Psy B728a and Pto DC3000. In contrast, the CEL ORF1 may 
contribute (weakly or redundantly) to Hrp secretion functions by promoting 
penetration of the system through the bacterial peptidoglycan layer. The ORF1 
product has extensive homology with E. coli MltD and shares a lysozyme-like domain 
with the product of ipgF (Mushegian et al., 1996), a Shigella flexneri gene that is also 
located between loci encoding a type III secretion system and effector proteins 
(Allaoui et al., 1993). Mutations in these genes in Pto and S. flexneri have no 
obvious phenotype (Lorang and Keen, 1995; Allaoui et al., 1993), as is typical for 
genes encoding peptidoglycan hydrolases (Dijkstra and Keck, 1996). 

The loss of pathogenicity in Pto mutant CUCPB5115, with an avrE-ORF5
deletion in the CEL, was surprising because pathogenicity is retained in
DC3000 mutants in which the corresponding operons are individually disrupted
(Lorang and Keen, 1995; Charkowski et al., 1998). In assessing the possible function
of this region and the conservation of its constituent genes, it should be noted that
avrE is unlike other avr genes found in Pto in that it confers avirulence to P. syringae
pv. glycinea on all tested soybean cultivars and it has a homolog (dspE) in
E. amylovora that is required for pathogenicity (Lorang and Keen, 1995; Bogdanove
et al., 1998b). Although the CEL is required for pathogenicity, it is not essential for
type III effector protein secretion because the mutant still secretes AvrPto. It also
appears to play no essential role in type III translocation of effector proteins into plant
cells because the mutant still elicits the HR in nonhost tobacco and in a PtoR-resistance
tomato line, and pHIR11, which lacks this region, appears capable of
translocating several Avr proteins (Gopalan et al., 1996; Pirhonen et al., 1996). The
conservation of this region in the divergent pathovars Psy and Pto, and its importance
in disease, suggest that the products of the CEL may be redundantly involved in a
common, essential aspect of pathogenesis.

The similar G + C content and codon usage of the hrp/hrc genes, the
genes in the CEL, and total P. syringae genomic DNA suggest that the Hrp Pai was
acquired early in the evolution of P. syringae. Although the EEL region may have
similarly developed early in the radiation of P. syringae into its many pathovars,
races, and strains, the apparent instability that is discussed above suggests ongoing
rapid evolution at this locus. Indeed, many P. syringae avr genes are associated with
mobile genetic elements, regardless of their location (Kim et al., 1998). Thus, it
appears that Hrp-mediated pathogenicity in P. syringae is collectively dependent on a
set of genes that are universal among divergent pathovars and on another set that
varies among strains even in the same pathovar. The latter are presumably acquired
and lost in response to opposing selection pressures to promote parasitism while
evading host R-gene surveillance systems.

Example 5 - Role of ShcA as a Type III Chaperone for the HopPsyA Effector

The ORF upstream of hopPsyA, tentatively named shcA, encodes a
protein product of the predicted molecular mass. The ORF upstream of the hopPsyA
gene in P. s. syringae 61 (originally designated ORF1) shares sequence identity with
exsC and ORF7, which are genes adjacent to type III effector genes in P. aeruginosa
and Yersinia pestis, respectively (Frank and Iglewski, 1991; Perry et al., 1998).
Although neither of these ORFs has been shown experimentally to encode a
chaperone, they have been noted to share properties that type III chaperones often
possess (Cornelis et al., 1998). One of these properties is the location of the
chaperone gene itself (Figures 1 and 6). Chaperone genes are often adjacent to a gene
that encodes the effector protein with which the chaperone interacts. Furthermore,
shcA also shares other common characteristics of type III chaperones: its protein
product is relatively small (about 14 kDa), it has an acidic pI, and it has a C-terminal
region that is predicted to be an amphipathic α-helix. To begin assessing the function
of shcA, it was first determined whether shcA encodes a protein product. A construct
was prepared using PCR that fused shcA in-frame to a sequence encoding the FLAG
epitope. This construct, pLV26, contains the nucleotide sequence upstream of shcA,
including a putative ribosome binding site (RBS). DH5αF'IQ(pLV26) cultures were
grown in rich media and induced at the appropriate density with IPTG. Whole cell
lysates were separated by SDS-PAGE and analyzed with immunoblots using anti-FLAG
antibodies. By comparing the ShcA-FLAG encoded by pLV26 to a construct
that made ShcA-FLAG from a vector RBS, it was concluded that the native RBS
upstream of shcA was competent for translation (Figure 7). Thus, the shcA ORF is a
legitimate gene that encodes a protein product.

To test the effects of shcA on bacterial-plant interactions, an shcA
mutation was constructed in the minimalist hrp/hrc cluster carried on cosmid pHIR11.
There are distinct advantages to having the shcA mutation marker-exchanged into
pHIR11. The main one is that the HR assay can be used as a screen to determine if
HopPsyA is being translocated into plant cells because the pHIR11-dependent HR
requires the delivery of HopPsyA into plant cells (Alfano et al., 1996; Alfano et al.,
1997). With the chromosomal shcA mutant, other Hop proteins would probably be
delivered to the interior of plant cells. Some of these proteins would be recognized by
the R gene-based plant surveillance system and initiate an HR, masking any defect in
HopPsyA delivery. E. coli MC4100 carrying pLV10, a pHIR11 derivative that
contains a nonpolar nptII cartridge within shcA, was unable to elicit an HR on tobacco
(Figure 8). This indicates that shcA is required for the translocation of HopPsyA into
plant cells. To determine if HopPsyA was secreted in culture, cultures of the
nonpathogen P. fluorescens 55 were grown. This bacterium carried either pHIR11,
pCPP2089 (a pHIR11 derivative defective in type III secretion), or pLV10. The
representative results can be seen in Figure 8. shcA was required for the in-culture
type III secretion of the HopPsyA effector protein, but not for secretion of HrpZ, another
protein secreted by the pHIR11-encoded Hrp system. These results indicate that the
defect in type III secretion is specific to HopPsyA and are consistent with shcA
encoding a chaperone for HopPsyA. It was after these results that the ORF upstream
of the hopPsyA gene was named shcA, for specific hop chaperone for HopPsyA, a
naming system consistent with the one researchers have employed for
chaperones in the archetypal Yersinia type III system.

Example 6 - Cytotoxic Effects of hopPsyA Expressed in Plants 

Transient expression of hopPsyA DNA in planta induces cell death in
Nicotiana tabacum, but not in N. benthamiana, bean, or in Arabidopsis. To determine
whether HopPsyA induced cell death on tobacco leaves as it did when produced in
tobacco suspension cells, a transformation system that delivers the hopPsyA gene on
T-DNA of Agrobacterium tumefaciens was used (Rossi et al., 1993; van den
Ackerveken et al., 1996). This delivery system works better than biolistics for
transiently transforming whole plant leaves. For these experiments, vector pTA7002,
kindly provided by Nam-Hai Chua and his colleagues at Rockefeller University, was
used. The unique property of this vector is that it contains an inducible expression
system that uses the regulatory mechanism of the glucocorticoid receptor (Picard et
al., 1988; Aoyama and Chua, 1997; McNellis et al., 1998). pTA7002 encodes a
chimeric transcription factor consisting of the DNA-binding domain of GAL4, the
transactivating domain of the herpes viral protein VP16, and the receptor domain of
the rat glucocorticoid receptor. Also contained on this vector is a promoter containing
GAL4 upstream activating sequences (UAS) upstream of a multiple cloning site.
Thus, any gene cloned downstream of the promoter containing the GAL4-UAS is
induced by glucocorticoids, of which a synthetic glucocorticoid, dexamethasone
(DEX), is available commercially. hopPsyA was PCR-cloned downstream of the
GAL4-UAS. Plant leaves from several different test plants were infiltrated with
Agrobacterium carrying pTA7002::hopPsyA, and after 48 hours these plants were
sprayed with DEX. Only N. tabacum elicited an HR in response to the DEX-induced
transient expression of hopPsyA (Figure 13A). In contrast, N. benthamiana produced
no obvious response after DEX induction (Figure 13B). Moreover, transient
expression of hopPsyA in bean plants (Phaseolus vulgaris L. 'Eagle') (data not shown)
and Arabidopsis thaliana ecotype Col-1 (Figure 13) did not result in an HR. These
results suggest that bean cv. Eagle, Arabidopsis Col-1, and N. benthamiana lack a
resistance protein that can recognize HopPsyA. The lack of an apparent defense
response for HopPsyA transiently expressed in bean was predicted, because HopPsyA
is normally produced in P. s. syringae 61, a pathogen of bean. It was less certain
how transient expression of HopPsyA would affect Arabidopsis. However,
since P. s. tomato DC3000, a pathogen of Arabidopsis, appears to have a hopPsyA
homolog based on DNA gel blots using hopPsyA as a probe, it was expected that
HopPsyA would not be recognized by an R protein in Arabidopsis (i.e., no HR
produced) (Alfano et al., 1997). Thus, these plants (bean, Arabidopsis, and N.
benthamiana) should represent ideal plants to explore the bacterial-intended role of
HopPsyA in plant pathogenicity.

P. s. syringae 61 secretes HopPsyA in culture via the Hrp (type III)
protein secretion system. Because the P. syringae Avr proteins AvrB and AvrPto were
found to be secreted by the type III secretion system encoded by the functional E.
chrysanthemi hrp cluster carried on cosmid pCPP2156 expressed in E. coli (Ham et
al., 1998), detection of HopPsyA secretion in culture directly via the native Hrp
system carried in P. s. syringae 61 was tested. P. s. syringae 61 cultures grown in
hrp-derepressing fructose minimal medium at 22°C were separated into cell-bound
and supernatant fractions by centrifugation. Proteins present in the supernatant
fractions were concentrated by TCA precipitation, and the cell-bound and supernatant
samples were resolved with SDS-PAGE and analyzed with immunoblots using anti-HopPsyA
antibodies. A HopPsyA signal was detected in supernatant fractions from
wild-type P. s. syringae 61 (Figure 14). Importantly, HopPsyA was not detected in
supernatant fractions from P. s. syringae 61-2089, which is defective in Hrp secretion,
indicating that the HopPsyA signal in the supernatant was due specifically to type III
protein secretion (Figure 14). As a second control, both strains contained pCPP2318,
which encodes the mature β-lactamase lacking its N-terminal signal peptide and
provides a marker for cell lysis. β-lactamase was detected only in the cell-bound
fractions of these samples, clearly showing that cell lysis did not occur at a significant
level (Figure 14). The fact that HopPsyA is secreted via the type III secretion system
in culture and that the avirulence activity of HopPsyA occurs only when it is
expressed in plant cells strongly supports the conclusion that HopPsyA is delivered
into plant cells via the type III pathway.

HopPsyA contributes in a detectable, albeit minor, way to growth of P.
s. syringae 61 in bean. The effect of a hopPsyA mutation on the multiplication of P.
s. syringae 61 in bean tissue has been reported (Huang et al., 1991). Those data
essentially indicate that HopPsyA contributes little to the ability of P. s. syringae 61
to multiply in bean. The P. s. syringae 61 hopPsyA mutant does not grow as well in
bean leaves as the wild-type strain (Figure 15). This was unexpected, because these
results are in direct conflict with the previously reported data. One explanation for the
discrepancy is that the previous reports focused primarily on the major phenotype that
a hrp mutant exhibits on in planta growth and predated the discovery that HopPsyA
was a type III-secreted protein. Thus, it is quite possible that the earlier experiments
missed the more subtle effect that HopPsyA appears to have on the multiplication of
P. s. syringae 61 in bean tissue (Huang et al., 1991). The data presented here support
the conclusion that HopPsyA contributes to the pathogenicity of P. s. syringae and are
consistent with the hypothesis that the majority of Hops from P. syringae contribute
subtly to pathogenicity. The lack of strong pathogenicity phenotypes for mutants defective in
different avr and hop genes may be due to possible avr/hop gene redundancy or a
decreased dependence on any one Hop protein through coevolution with the plant.
Indeed, the type III-delivered proteins of plant pathogens that are delivered into plant
cells may not be virulence proteins per se, but rather they may suppress responses of
the plant that are important for pathogenicity to proceed (Jakobek et al., 1993). These
responses may be defense responses or other more general processes that maintain the
status quo within the plant (e.g., the cell cycle).

Example 7 - Molecular Interactions of HopPsyA 

HopPsyA interacts with the Arabidopsis Mad2 protein in the yeast 2-hybrid
system. To determine a pathogenic target for HopPsyA, the yeast 2-hybrid
system was used with cDNA libraries made from Arabidopsis (Fields and Song, 1989;
Finley and Brent, 1994). In the yeast 2-hybrid system, a fusion between the protein of
interest (the "bait") and the LexA DNA-binding domain was transformed into a yeast
tester strain. A cDNA expression library was constructed in a vector that creates
fusions to a transcriptional activator domain. This library was transformed into the
tester strain en masse, and clones encoding partners for the "bait" were selected via
their ability to bring the transcriptional activator domain into proximity with the
DNA-binding domain, thus initiating transcription of the LEU2 selectable marker gene. A
second-round screening of candidates that activate the LEU2 marker relies on their
ability to also activate a lacZ reporter gene. Bait constructs were initially made with
hopPsyA in the yeast vector pEG202 that corresponded to a full-length HopPsyA-LexA
fusion, the carboxy-terminal half of HopPsyA fused to LexA, and the amino-terminal
half of HopPsyA fused to LexA; these constructs were named pLV23, pLV24,
and pLV25, respectively. However, pLV23 was lethal to yeast and pLV25 activated
the lacZ reporter gene in relatively high amounts on its own (i.e., without the
activation domain present). Thus, neither pLV23 nor pLV25 was used to screen for
protein interactors via the yeast 2-hybrid system. pLV24, which contains the 3'
portion of hopPsyA fused to lexA, proved to be an appropriate construct to use for bait
in the yeast 2-hybrid system, because it did not autoactivate the lacZ reporter gene
and, based on the lacZ repression assay using pJK101, the 'HopPsyA-LexA fusion
produced by pLV24 appeared to localize to the nucleus. In addition, it was confirmed
that pLV24 made a protein of the appropriate size that corresponds to HopPsyA by
performing immunoblots with anti-HopPsyA antibodies on yeast cultures carrying
this vector.

Initial screens with pLV24 and Arabidopsis cDNA libraries in the
yeast 2-hybrid vector pJG4-5. From three independent screens, several hundred
putative interactors with HopPsyA were identified, each activating the two reporter
systems to varying degrees. When these putative positive yeast strains were
rescreened and criteria were limited to interactors that strongly induced both the lacZ
reporter and LEU2 gene in the presence of galactose, about 50 yeast strains were
identified that appeared to contain pJG4-5 derivatives that encoded proteins that could
interact with the C-terminal half of HopPsyA. DNA gel blots using PCR-amplified
inserts from selected pJG4-5 derivatives as probes allowed each of these putative
positives to be grouped. Approximately 50% of the pJG4-5 derivatives that encoded
strong HopPsyA interactors belonged to the same group. A pJG4-5 derivative
containing this insert, pLV116, was sequenced. The predicted amino acid sequence of
the insert contained within pLV116 shared high amino acid identity to Mad2
homologs (for mitotic arrest deficient) found in yeast, humans, frogs, and corn.
Moreover, based on amino acid comparison with the other Mad2 proteins, pLV116
contains a cDNA insert that corresponds to the full-length mad2 mRNA. Table 2
below shows the amino acid percent identity of all of the Mad2 homologs currently in
the databases.



Table 2: Percent Amino Acid Sequence Identity Between Different Mad2 Homologs* 



Mad2 Homolog     Arabidopsis   Corn   Human   Mouse   Frog   Fission Yeast   Budding Yeast
Arabidopsis
Corn             81.3
Human            44.4          44.9
Mouse            45.4          45.9   94.6
Frog             43.3          42.9   78.3    77.3
Fission Yeast    40.4          41.9   43.8    43.8    46.3
Budding Yeast    38.3          38.8   39.3    39.3    39.8   45.4


* Comparisons were made with the MEGALIGN program at DNAStar (Madison, WI) using sequences
present in GenBank. Abbreviations and accession numbers are as follows: Arabidopsis, A. thaliana
Col-0 (this work); Corn, Zea mays (AAD30555); Human, Homo sapiens (NP_002349); Mouse, Mus
musculus (AAD09238); Frog, Xenopus laevis (AAB41527); Fission yeast, Schizosaccharomyces
pombe (AAB68597); Budding yeast, Saccharomyces cerevisiae (P40958).
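
The values in Table 2 were produced with MEGALIGN, and the exact alignment and scoring conventions of that program are not reproduced here. As a rough illustration only, the sketch below shows one common way a percent identity could be computed from two sequences that have already been aligned; the function name, the choice to skip gapped columns, and the toy sequences are all assumptions made for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Both inputs must come from the same pairwise alignment, with '-' marking gaps.
    Other conventions (e.g., dividing by the full alignment length) give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns (an assumption of this sketch)
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Hypothetical toy alignment, not real Mad2 sequence.
    print(percent_identity("MKV-LT", "MKVALS"))
```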



Not unexpectedly, the sequence of the Arabidopsis Mad2 protein is most closely
related to the corn Mad2, the only other plant Mad2 homolog represented in the databases.
The corn Mad2 is about 82% identical to the Arabidopsis Mad2. Figures 16A-B show
yeast strains containing either pLV24 and pJG4-5, pEG202 and pLV116, or pLV24
and pLV116 on leucine drop-out plates and plates containing X-Gal, showing that
β-galactosidase and LEU2 activity are induced only when both HopPsyA and Mad2
are present. It is important to note that the cDNA library that yielded mad2 has been
used for many different yeast 2-hybrid screens and a mad2 clone has never been
isolated from it before. Thus, the results shown in Figures 16A-B are unlikely to
represent an artifact produced by the nature of the cDNA library. Moreover, different
Mad2 homologs are known to interact with specific proteins and one of these
homologs was isolated with a yeast 2-hybrid screen using a protein of the spindle
checkpoint as bait (Kim et al., 1998). This is reassuring for two reasons. First, other
Mad2 homologs do not appear to be nonspecifically "sticky" proteins. Second, they
appear to modulate cellular processes through protein-protein interactions.

The above results are very promising, because Mad2 is a regulator
controlling the transition from metaphase to anaphase during mitosis, a key step in the
cell cycle of eukaryotes. The eukaryotic cell cycle is dependent on the completion of
earlier events before another phase of the cell cycle can be initiated. For example,
before mitosis can occur DNA replication has to be completed. Some of these
dependencies in the cell cycle can be relieved by mutations and represent checkpoints
that ensure the cell cycle is proceeding normally (Hartwell and Weinert, 1989). In
pioneering work, Hoyt et al. and Li and Murray independently discovered that there is
a checkpoint in place in Saccharomyces cerevisiae to monitor whether the spindle
assembly required for chromosome segregation is completed (Hoyt et al., 1991; Li
and Murray, 1991). This so-called spindle checkpoint was discovered when the
observation was made that wild-type yeast cells plated onto media containing drugs
that disrupt microtubule polymerization arrested in mitosis, whereas certain mutants
proceeded into anaphase. These initial reports identified 6 different nonessential genes
that are involved in the spindle checkpoint: bub1-3, named for budding uninhibited by
benzimidazole, and mad1-3, for mitotic arrest deficient. Mutants in these genes
ignore spindle assembly abnormalities and attempt mitosis regardless. In the years
since, the spindle checkpoint has been shown to be conserved in other eukaryotes and
many advances have occurred resulting in a better picture of what is taking place at
the spindle checkpoint (Glotzer, 1996; Rudner and Murray, 1996).

Required for the transition from metaphase to anaphase (as well as
other cell cycle transitions) is the ubiquitin proteolysis pathway. Proteins that inhibit
entry into anaphase (e.g., Pds1 in S. cerevisiae) are tagged for degradation via the
ubiquitin pathway by the anaphase-promoting complex (APC) (King et al., 1996).
Only when these proteins are degraded by the 26S proteasome are the cells allowed to
cycle to anaphase. Although it is not well understood how the APC knows when to
tag the anaphase inhibitors for degradation, there have been several important
advances (Elledge, 1996; Elledge, 1998; Hardwick, 1998). The Mad2 protein and the
Bub1 protein kinase have been shown to bind to kinetochores when these regions are
not attached to microtubules (Chen et al., 1996; Li and Benezra, 1996; Taylor and
McKeon, 1997; Yu et al., 1999). Thus, these proteins appear to somehow relay a
signal that not all of the chromosomes are bound to spindle fibers and ready to separate.
Mad1 encodes a phosphoprotein, which becomes hyperphosphorylated when the
spindle checkpoint is activated, and the hyperphosphorylation of Mad1 is dependent
on functional Bub1, Bub3, and Mad2 proteins (Hardwick and Murray, 1995). Another
required protein in this checkpoint is Mps1, a protein kinase that activates the spindle
checkpoint when overexpressed in a manner that is dependent on all of the Bub and
Mad proteins, indicating that Mps1 acts very early in the spindle checkpoint
(Hardwick et al., 1996).

Based on data from the different Mad2 homologs that have been
studied, Mad2 appears to have a central role in the spindle checkpoint. Addition of
Mad2 to Xenopus egg extracts results in inhibition of cyclin B degradation and mitotic
arrest due to the inhibition of the ubiquitin ligase activity of the APC (Li et al., 1997).
The overexpression of Mad2 from fission yeast causes mitotic arrest by activating the
spindle checkpoint (He et al., 1997). In contrast, introducing anti-Mad2 antibodies into
mammalian cell cultures causes early transition to anaphase in the absence of
microtubule drugs, indicating that Mad2 is involved in the normal cell cycle. Several
reports suggest that different Mad2 homologs directly interact with the APC (Li et al.,
1997; Fang et al., 1998; Kallio et al., 1998). Another protein, called Cdc20 in S.
cerevisiae, binds to the APC and is required for activation of the APC during certain cell
cycles, and Mad2 binds to it (Hwang et al., 1998; Kim et al., 1998; Lorca et al., 1998;
Wassmann and Benezra, 1998). The picture that is emerging from all of these exciting
findings is that Mad2 acts as an inhibitor of the APC, probably by binding to Cdc20.
When Mad2 is not present, Cdc20 binds to the APC, which activates the APC to
degrade inhibitors of the transition to anaphase. Figure 12 shows a summary of the
spindle checkpoint focusing on Mad2's involvement and using the names of the
spindle checkpoint proteins from S. cerevisiae.
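
The regulatory picture summarized above can be restated as a small piece of Boolean logic. The sketch below is a deliberately simplified toy model, not a description of the actual biochemistry: the function name, its parameters, and the all-or-nothing treatment of Mad2, Cdc20, and the APC are assumptions introduced only for illustration.

```python
def anaphase_permitted(unattached_kinetochores: int, mad2_functional: bool = True) -> bool:
    """Toy Boolean model of the Mad2 arm of the spindle checkpoint."""
    if unattached_kinetochores < 0:
        raise ValueError("unattached_kinetochores must be non-negative")
    # Mad2 signals only if it is functional and some kinetochore lacks microtubules.
    mad2_signal = mad2_functional and unattached_kinetochores > 0
    # Simplification: a Mad2 signal keeps Cdc20 from activating the APC.
    cdc20_free = not mad2_signal
    # The APC is treated as active only when Cdc20 is free to bind it; an active
    # APC tags anaphase inhibitors for degradation, which permits anaphase.
    apc_active = cdc20_free
    return apc_active


if __name__ == "__main__":
    print(anaphase_permitted(0))                          # all kinetochores attached
    print(anaphase_permitted(3))                          # checkpoint engaged
    print(anaphase_permitted(3, mad2_functional=False))   # Mad2 function lost
```

In this toy model, loss of Mad2 function allows anaphase even when kinetochores remain unattached, which corresponds to the mutant behavior described for the mad genes.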

The plant spindle checkpoint: A possible target of bacterial pathogens. 
Many of the cell cycle proteins from animals have homologs in plants (Mironov et al., 
1999). In fact, one of the earliest clues that a spindle checkpoint exists came
from plants. The observation was that chromosomes that lagged behind in
their attachment to the spindle caused a delay in the transition to anaphase (Bajer and
Mole-Bajer, 1956). Moreover, mad2 has been recently isolated from corn and the
Mad2 protein localization in plant cells undergoing mitosis is consistent with the
localization of Mad2 in other systems (Yu et al., 1999). Based on a published
meeting report, genes that encode components of the APC from Arabidopsis have
been recently cloned (Inze et al., 1999). Thus, it appears that a functional spindle
checkpoint probably is conserved in plants. The data presented above show that the
P. syringae HopPsyA protein interacts with the Arabidopsis Mad2 protein in the yeast
2-hybrid system. 

It is possible that a pathogenic strategy of a bacterial plant pathogen is 
to alter the plant cell cycle. Duan et al. recently reported that pthA, a member of the 
avrBs3 family of avr genes from X. citri, is expressed in citrus and causes cell 
enlargement and cell division, which may implicate the plant cell cycle (Duan et al., 
1999). If HopPsyA does target Mad2, at least two possible benefits to pathogenicity 
can be envisioned. Since plant cells in mature leaves are quiescent, one benefit of 
delivering HopPsyA into these cells may be that it may trigger cell division through 
its interaction with Mad2. This is consistent with the observation that anti-Mad2 
antibodies cause an early onset of anaphase in mammalian cells (Gorbsky et al., 
1998). More plant cells near the pathogen may increase the nutrients available in the 
apoplast. A second possible benefit may occur if HopPsyA is delivered into plant cells 
actively dividing in young leaves. Delivery of HopPsyA into plant cells of these 
leaves may derail the spindle checkpoint through its interaction with Mad2. These 
cells would be prone to more mistakes segregating their chromosomes; in some cells
this would result in death and the cellular contents would ultimately leak into the
apoplast providing nutrients for the pathogen.

Example 8 - Cytotoxic Effects of HopPtoA and HopPsyA Expressed in Yeast 

Both hopPtoA (SEQ. ID. No. 6) and hopPsyA (SEQ. ID. No. 35) were
first cloned into pFLAG-CTC (Kodak) to generate an in-frame fusion with the FLAG
epitope, which permitted monitoring of protein production with anti-FLAG
monoclonal antibodies. The FLAG-tagged genes were then cloned under the control
of the GAL1 promoter in the yeast shuttle vector p415GAL1 (Mumberg et al., 1994).
These regulatable promoters of Saccharomyces cerevisiae allowed comparison of
transcriptional activity and heterologous expression. The recombinant plasmids were
transformed into uracil auxotrophic yeast strains FY833/4, selecting for growth on
SC-Ura (synthetic complete medium lacking uracil) based on the presence of the
URA3 gene on the plasmid. The transformants were then streaked onto SC-Ura
medium plates containing either 2% galactose (which will induce expression of
HopPsyA and HopPtoA) or 2% glucose. No growth was observed on the plates
supplemented with 2% galactose. This effect was observed with repeated testing and
was not observed with empty vector controls, with four other effectors similarly
cloned into p415GAL1, or when raffinose was used instead of galactose. FLAG-tagged
nontoxic Avr proteins were used to confirm that the genes were differentially
expressed, as expected, on plates containing galactose. Importantly, the toxic effect
with HopPsyA was observed when the encoding gene was recloned into p416GALS,
which expresses foreign genes at a substantially lower level than p415GAL1.